NIELS BOHR 
AND THE DEVELOPMENT 
OF PHYSICS 




Essays dedicated to Niels Bohr 


on the occasion of his seventieth birthday 


Edited by W. PAULI 
with the assistance of 
L. ROSENFELD and V. WEISSKOPF 


PERGAMON PRESS 
NEW YORK · LONDON · PARIS


1955 


First published 1955 




U.S.A.: McGraw-Hill Book Co., Inc.,
330 West 42nd Street, New York 36, N.Y.
Printed in Great Britain by the Pitman Press, Bath, Somerset


CONTENTS 


C. G. DARWIN (Cambridge): The discovery of atomic number

W. HEISENBERG (Göttingen): The development of the interpretation
of the quantum theory

W. PAULI (Zurich): Exclusion principle, Lorentz group and
reflexion of space-time and charge

L. D. LANDAU (Moscow): On the quantum theory of fields

L. ROSENFELD (Manchester): On quantum electrodynamics

O. KLEIN (Stockholm): Quantum theory and relativity

H. B. G. CASIMIR (Eindhoven): On the theory of superconductivity

F. L. FRIEDMAN and V. F. WEISSKOPF (Cambridge, Mass.): The
compound nucleus

J. A. WHEELER (Princeton): Nuclear fission and nuclear
stability

J. LINDHARD (Copenhagen): On the passage through matter
of swift charged particles

PREFACE 


This book has been composed to pay NIELS BOHR a tribute of admiration
and gratitude on the occasion of his seventieth birthday. For this
conventional function we have tried to produce an unconventional
Festschrift. The successive chapters, each written by one of BOHR’s
older or younger collaborators and friends from many countries, evoke
some aspect of the general problems of physics with which he has been
most deeply concerned. Such a plan inevitably entails a somewhat
arbitrary selection. And, alas, there are other gaps in it which cruelly
remind us of friends too soon departed. Imperfect as it is, our choice of
subjects affords a fairly connected survey of the development of atomic
and nuclear physics in its most salient features: a result in no way
surprising for the workers in these fields, who are wont first and foremost
to look to BOHR for guidance and inspiration, and whose warmest
wish it is to enjoy his benevolent leadership in the common quest for
many years to come.


THE EDITORS


THE DISCOVERY OF ATOMIC NUMBER 
C. G. Darwin 


THE first thirty years of the present century will surely rank as the second 
heroic age of theoretical physics, rivalling if not surpassing the great age 
of NEWTON. This volume is a tribute in honour of one of the heroes of
this second age and it may serve as a fitting introduction to it if I make 
a short review of some of the events in the early part of the period with 
which he was specially associated. These were the discoveries that were 
being made in Manchester from about 1912 to 1914, during much of 
which time BOHR was there.
Some of the great discoveries which have advanced science may be 
called “easy” ones, not in the sense that they were at all easy to dis- 
cover, but that once made they are easy for everyone to understand. 
They can command universal acceptance without further argument, 
and do not demand any deep or radical revision of processes of thought 
in order to apprehend them. An example of such an “easy” discovery 
during the first heroic age is COPERNICUS’s recognition that the earth is 
not the centre of the universe. Once this theory was established, the 
consequences were so simple that there would be no need for anyone to 
look back to the period before it. The scientist—but not of course the 
historian of science—could almost afford to forget about the existence 
of COPERNICUS. On the other hand NEWTON’s laws of mechanics and of
gravitation have none of this easy quality, and no one who wished to 
master mechanics could possibly afford to forget about NEWTON. In the 
second heroic age there is roughly a parallel to this. Corresponding to 
NEWTON’s mechanics there are the great developments of atomic 
mechanics and the quantum theory which call for a similar degree of 
expert understanding, but at the base of them there is also what I have 
called an “easy” discovery, the discovery of Atomic Number. Once this
had been established it was hardly necessary for anyone to look back 
to the period before it—again excepting the historians of science. The 
later chapters of this work will be dealing with the wonderful develop- 
ments of atomic dynamics that have been made since the beginning of 
the twentieth century, and an account of the discovery of atomic number, 
including the discovery of isotopes, may serve as a suitable introduction 
to these subjects, by one who was an eye-witness of much that led to it. 


It is seldom that there is anything entirely new about a scientific dis- 
covery; it has nearly always been adumbrated before, but it is often 
hard for later generations—who know all the right answers—in looking 
back, to assess the difficulties of their predecessors. The ground was 
usually encumbered with a lot of rival speculations, each of which 
involved some apparently serious objections to it, and it was hard for 
anyone at the time to judge which among these objections could be 
neglected and which were insuperable. Not the least of the difficulties 
that have to be faced at such a period can be attributed to what may be 
called the human time-scale, which dictates that a man forms his main 
opinions between the ages of 20 and 30, and though in later years he 
may adapt himself to newer ideas and may still himself make important 
contributions, yet he feels a greater resistance to the new things, so that 
usually they will appear to him intrinsically more difficult than the old 
ones. The literature of science is full of examples. Thus no one could
deny that RAYLEIGH kept in the van of scientific progress almost to the 
end of his life, and yet as late as 1899 there is a quite surprising statement
in one of his papers [1]. Some twenty-eight years earlier he had
written his first celebrated paper on the light of the sky, and at that 
time he had used the still largely accepted elastic-solid theory of light. 
In 1899 he took up the matter again, and he again used the elastic-solid 
theory, but simply in order to make this paper match with the earlier 
one. He could recognize that for his purpose it was just as good as the 
electromagnetic theory, and that only trivial alterations would be 
needed to make it fit that theory. But in introducing these alterations 
he says: “In the electric theory, to be preferred on every ground 
except that of easy intelligibility. . . .” Would anyone who had grown 
up in even the very next scientific generation after him have found 
the elastic-solid theory as easily intelligible as the electromagnetic? 
With this sort of thing in mind, it is rather important to pay some 
attention to the mental climate of the periods preceding any great 
discovery. 

A main thing that has to be allowed for in the period before radio- 
activity had been discovered was the very much more tepid faith that 
everyone had in the atomic theory. There was not the absolute confi- 
dence in its truth that came later; indeed many chemical textbooks of 
the nineteenth and even the earliest twentieth century tended to present 
DALTON’s theory almost apologetically as a highly plausible and con- 
venient, but quite unverified, hypothesis. The faith of the chemists in 
atoms was comparatively weak though few of them would have gone as 
far as OSTWALD, who as late as 1900 propounded the view that the law 
of combining weights could be explained without using atoms at all. 
The physicists during the nineteenth century for the most part erred in
a different direction; they accepted the atoms, but were not much
interested in them. Thus FARADAY’s study of electrolysis gave evidence
for an atom of electricity which had almost exactly the same force and 
quality as DALTON’s evidence for the other atoms, and yet no one really 
thought much about it. Indeed it is quite startling to read a sentence in 
MAXWELL’s book on electricity, written in 1873. Certainly nobody 
could suspect MAXWELL of a disinclination to believe in atoms, and yet 
after explaining FARADAY’s law he writes [2]: “It is extremely improbable 
that when we come to understand the true nature of electrolysis we 
shall retain in any form the theory of molecular charges, for then we 
shall have obtained a secure basis on which to form a true theory 
of electric currents and so become independent of these provisional 
hypotheses.” 


The fact of the matter is that at that time and for long after, the 
atomic theory, in spite of the promising start given to it by MAXWELL, 
was regarded as a relatively unimportant corner of the field of physics. 
The centre of interest, under the stimulus of FARADAY, and of MAXWELL 
himself too, was the electric field, which they endowed with a very 
objective reality in the form of lines of force; such matters—perhaps 
specially in England—tended to dominate the thoughts of the physicists. 
These things did indeed have considerable influence in important 
branches of physics; thus the study of the mechanism of the aether, 
and the attempt to reconcile it with stellar aberration, did ultimately 
help in leading towards the theory of relativity. But after MAXWELL’s 
successes in gas theory, especially in predicting and verifying the law 
of viscosity, it is rather surprising to find how little attention was paid 
to the subject by the general run of physicists. It was recognized that 
GIBBS and BOLTZMANN had done good work, but it would very much
have surprised most people, if they had been told that soon statistical 
theory would become the central subject of physics. 


One thing which diminished people’s concern with atoms may have 
been the uncertainty about their sizes and the numbers of them. It is 
illogical perhaps that such a detail should have this effect, but it is 
humanly very natural. The first reasonably reliable estimate was made 
by MAXWELL with the use of his theory of viscosity. He gave it as 
19 × 10¹⁸ molecules in a cubic centimetre [3], which may be compared
with the now known value of 27 × 10¹⁸. In view of the rather specialized
theories he had to make about atomic collisions in his theory of vis- 
cosity, this value was surprisingly good, but its accuracy long remained 
very uncertain. RAYLEIGH, in the paper cited on the light of the sky, 
confirmed it, and guaranteed that it could not be less than half this, 
because if it had been smaller the air would not be as transparent as it
in fact is; but still the value was not at all well fixed. With the discovery
of x-rays and radioactivity the atomic age started, and it
promised to give much better values, but in fact the earliest ones were 
not very good either. The first method was a brilliant piece of work by 
TOWNSEND, then working in Cambridge, and its essential feature was 
the rate of fall of a cloud, though not yet the cloud of a Wilson chamber. 
He had also to make other delicate measurements, and in the circum- 
stances it is surprising that his work yielded as good a value as it did 
for the charge of the electron; it was only 40 per cent too small. Later 
methods at Cambridge used the Wilson cloud chamber, in which only 
the negative ions formed cloud drops; this work was essentially on the 
lines later perfected in 1910 by MILLIKAN, and it gave an answer also 
about 40 per cent too small. I can only find one record of a notably 
good value before the time of MILLIKAN, and this was derived by 
PLANCK from his radiation formula [4]. He fitted the experimental 
results to his formula and so derived h and k, and from k he derived the 
electronic charge as 4.69 × 10⁻¹⁰ e.s.u. This value is astonishingly
good, but we must make some allowance for the caution people would 
have felt in accepting it, because it is hard for anyone to feel perfectly 
confident about one of two constants in a formula when the other one 
is completely unknown and mysterious. 
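
The step from k to the electronic charge can be retraced numerically. The sketch below is an editorial illustration and not PLANCK's own calculation: it assumes modern values for the Faraday, the gas constant and BOLTZMANN's constant (none of which is quoted in the essay), takes the number of molecules in a mole as R/k, and divides the Faraday by that number. It then sets the historical figures quoted above against the modern ones.

```python
# Modern reference values; these are assumptions of the sketch, not data from the essay.
FARADAY = 96485.332           # coulombs per mole
GAS_CONSTANT = 8.314462618    # joules per mole per kelvin
BOLTZMANN = 1.380649e-23      # joules per kelvin
ESU_PER_COULOMB = 2.99792458e9


def charge_from_k(k):
    """Elementary charge in e.s.u.: molecules per mole is R/k, and e = F / (R/k)."""
    molecules_per_mole = GAS_CONSTANT / k
    return FARADAY / molecules_per_mole * ESU_PER_COULOMB


def relative_error(estimate, reference):
    """Signed fractional deviation of an estimate from a reference value."""
    return (estimate - reference) / reference


e_modern = charge_from_k(BOLTZMANN)
planck_error = relative_error(4.69e-10, e_modern)   # PLANCK's value for e
maxwell_error = relative_error(19e18, 27e18)        # MAXWELL's molecules per cc

print(f"modern e        = {e_modern:.4e} e.s.u.")
print(f"PLANCK's error  = {planck_error:+.1%}")
print(f"MAXWELL's error = {maxwell_error:+.1%}")
```

On these assumptions PLANCK's figure lies within a few per cent of the modern value, whereas the cloud methods described above were some 40 per cent low.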


In the period immediately preceding the discovery of atomic number 
there were two important contributions made to the general subject. 
One was SODDY’s proposal of the idea of isotopes [5]. He and other
workers had failed to separate ionium from thorium by any chemical 
means, and they proposed the principle that they were chemically 
absolutely inseparable. At that time such a proposal might well be 
regarded with some caution, because it was not so many years since the 
last of the rare earths had been laboriously separated from one another, 
and so it could not be absolutely excluded that still more laborious
methods might separate the radio-elements. But there was a strong 
objection against this possibility because most of the radio-elements lie 
in the part of the periodic table near the inert gases, and the elements in 
this part of the table have characteristics that are rather predictable, 
and they should be very easily separable from one another. Moreover 
in this part of the table there is simply no room for more elements, and 
so the case for the existence of radio-active isotopes could be accepted 
as being very strong. Certainly many people must have considered the 
possibility of inactive elements also having isotopes, but most chemists 
would have rejected it absolutely on the ground that much extremely 
accurate work had invariably given a constant value for each atomic 
weight; if chlorine from any source whatever always gave 35.46 it was
felt that this must be the weight of every chlorine atom. 


The second suggestion leading towards the idea of atomic number 
was the work of J. J. THOMSON enquiring into the number of electrons 
in the atom [6]. At the time when he had discovered the mass of the 
electron to be so small, THOMSON, largely on account of his own
important early theoretical studies, was attracted to the idea of mass 
being a purely electromagnetic phenomenon; the electron was simply 
to be an electric field anchored to some sort of massless origin. It was 
hoped then that all mass would prove to be electromagnetic, but there 
was difficulty about the positive charges of atoms which would be 
needed in order to neutralize the electrons. He had assumed a distri- 
buted positive charge, partly because only with some such assumption 
would the atom have a definite size, but a charge of these dimensions 
would have practically no electromagnetic mass. It is not very clear 
what was thought about this, but at least in the writings of some of the 
authors of that period there is the hint that the mass of the atom must 
be due to the masses of its contained electrons; in fact there would be 
1800 electrons in a hydrogen atom. However, this improbability was 
not accepted seriously by anyone. THOMSON attempted to estimate the 
number of electrons in the atom directly. He used three methods and 
arrived at avowedly rough results which indicated that the number 
should be proportional to the atomic weight and numerically rather less 
than it. The first estimate was based on BARKLA’s work on the scattering
of x-rays, the second on the scattering of β-rays in passing through
gases, and the third was derived from the theory of optical refraction. 
In the light of the quantum theory of spectra the third method may be 
rejected as unsound in every respect, and the first, though in fact it 
yielded the value of half the atomic weight, is open to criticism for a 
variety of reasons including the absence of consideration of the Compton 
effect which of course was wholly unknown at that time. In the study 
of the scattering of β-rays there were a number of rather doubtful
approximations for which it may also be criticized, but the method was 
sound in principle and it yielded answers proportional to and rather 
less than the atomic weights of air and the other gases considered. 


These were the preliminaries known in 1911, when the subject came 
to the front in the Manchester laboratory. It was of course the scattering
of the α-particles that gave the final impulsion. The α-particle was
always RUTHERFORD’s favourite. He could see that its great mass and 
its great energy made it the most effective of all probes to show what 
was in the atom. It had the advantage of giving an easily visible 
scintillation, and at that time this was very much the easiest way of 
detecting the effect of a single atom—in spite of the almost intolerable 
tedium to the observer of having every day to spend a long time getting 
his eyes accommodated to the dark. The first work by GEIGER and
MARSDEN [7] examined the scattering of a pencil of α-rays in passing
through a metal foil, and the result was considered in terms of “compound
scattering,” that is of the cumulative effect of the passage past a
large number of atoms. The result bore out the previous evidence that 
the number of electrons in each atom was of the order of magnitude of 
its atomic weight, but of course this was not the important thing found. 
It was seen that there were a few α-particles scattered through such
broad angles, even right backwards, that no conceivable compound 
effect could possibly explain them, but that there must be some different 
cause. 


RUTHERFORD recognized that there must be an effect of “single
scattering,” and for a time he would merely say that there were the 
most tremendous forces somewhere in the atom. But in 1911 he tried 
the idea of a heavy central electric charge repelling the α-particle—it
was I think several months before it was called the nucleus—and at 
once the whole theory of the nuclear atom emerged. It was most 
interesting to see his methods of thought. He had done little mathematics
since his school days, and his knowledge of hyperbolas was essentially
that of a schoolboy, but it was exactly what he needed and it led
him perfectly straight to his famous law of scattering [8]. It is interesting 
to record how at this very early stage, even before the theory was verified 
at all, he could see many further steps it might lead to. Thus it was 
pointed out to him that the collisions he had been considering were 
with heavy atoms, and that there would be interesting results if he 
thought about the collisions with hydrogen or helium, because the light 
nucleus would itself be knocked forward at a high speed. He accepted 
the point immediately as being important, but characteristically he at 
once took it a step still further. The strong electrostatic forces of repulsion
prevent the α-particle ever getting close to a gold nucleus, but not
so with the nucleus of a light atom, and he at once saw that it would be 
these experiments with light atoms which alone would yield information 
as to the shape and structure of the nucleus itself, about which he was 
already thinking. 
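
RUTHERFORD's point about light nuclei can be made quantitative with a short calculation. What follows is a sketch under stated assumptions and not a reconstruction of his own working: the α-particle energy of 5 MeV, the rounded masses, and the modern figure of about 1.44 MeV fm for the Coulomb constant times e² are all assumptions of this illustration and are not given in the essay. The first function gives the head-on distance of closest approach, allowing for the recoil of the struck nucleus; the second gives the angular factor of the single-scattering law.

```python
import math

# Assumed values for the illustration; none of them comes from the essay.
COULOMB_MEV_FM = 1.44    # e^2 / (4 pi eps0) in MeV fm, approximate modern value
ALPHA_CHARGE = 2         # charge of the alpha particle in units of e
ALPHA_MASS = 4.0         # mass of the alpha particle, rounded, atomic mass units
ALPHA_ENERGY_MEV = 5.0   # assumed kinetic energy of the alpha particle


def closest_approach_fm(z_target, mass_target, energy_mev=ALPHA_ENERGY_MEV):
    """Head-on distance of closest approach in femtometres.

    The energy available against the repulsion is the kinetic energy in the
    centre-of-mass frame, which is what makes light targets behave so differently.
    """
    energy_cm = energy_mev * mass_target / (mass_target + ALPHA_MASS)
    return ALPHA_CHARGE * z_target * COULOMB_MEV_FM / energy_cm


def rutherford_angular_factor(theta_deg):
    """Relative single-scattering intensity, proportional to 1 / sin^4(theta/2)."""
    return 1.0 / math.sin(math.radians(theta_deg) / 2.0) ** 4


gold = closest_approach_fm(79, 197.0)
helium = closest_approach_fm(2, 4.0)
hydrogen = closest_approach_fm(1, 1.0)

print(f"gold: {gold:.1f} fm, helium: {helium:.2f} fm, hydrogen: {hydrogen:.2f} fm")
print(f"intensity at 90 degrees relative to 180 degrees: {rutherford_angular_factor(90.0):.1f}")
```

With these assumed figures the α-particle is turned back some tens of femtometres from a gold nucleus but comes within a few femtometres of a hydrogen or helium nucleus, which is why only the light atoms could be expected to reveal anything of the nucleus itself.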

However, all this depended on the verification of the main hypothesis, 
and this verification was undertaken in a laborious series of experiments 
by GEIGER and MARSDEN [9]. The results were brilliantly successful in 
every respect, and they yielded about 100e for the charge of the gold 
nucleus, and a value about ten times less for the nucleus of aluminium. 
Years later the work was repeated by CHADWICK, and he obtained 
results giving values quite close to the atomic numbers of the elements, 
which by then had been fully established. 


At this stage the whole Manchester laboratory believed in the
undoubted existence of atomic number, defined as the nuclear charge.
The idea of what the number meant was perfectly precise, but the value 
of it could not yet be known precisely. Thus the accuracy of the experi- 
ments would admit of gold being 79 instead of 100, but it seemed not 
excluded that there might be hitherto unknown gaps in the periodic 
table which would bring the number closer to its estimated value. 
There remained outstanding the question of fractional atomic weights, 
to which I shall return later, but it may be noted that whichever way 
that result had gone there was still complete justification for accepting 
the principle of atomic number. To those working at the time in 
Manchester the work on nuclear scattering made the idea quite con- 
vincing and it was fully accepted there, though in fact the proposal of 
the principle of atomic number was first published rather later else- 
where [10]. Its general acceptance in the outer world had to await con- 
firmatory evidence from other sources. 

The first of these came very quickly from BOHR, who had arrived in
Manchester about this time. He had a deeper insight into the basic 
principles of physics than anyone else there, and he applied it to a 
variety of problems. He naturally at once fastened on the nuclear 
theory as the most important thing and could see better than others 
not only its virtues but also the difficulties it would involve. Chief 
among these was the point that dynamical principles could provide no 
scale of length to determine how big the atom would be; in this it was 
unlike J. J. THOMSON’s atom with its field of distributed positive 
electricity. This suggested to him that the size might depend on the 
quantum [11], and his application of the idea was much more revolu- 
tionary than it may appear now because it involved an entirely new use 
of the quantum*. Up to that time the quantum had always been 
associated with energy, both in PLANCK’s theory and in the theories of 
EINSTEIN and DEBYE of specific heats. The quantization of angular
momentum was an entirely new thing, and his success with it was of 
course absolute. There can be few other cases in science where a theory 
has been made which succeeds in yielding a particular number—here 
RYDBERG’s constant—from quantities all of which are known, without
the admissibility of any adjustable constant to help in doing so. 
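
The absence of any adjustable constant can be checked directly. The sketch below is an editorial illustration which assumes the standard textbook expression for RYDBERG's constant with a nucleus of infinite mass, R = m e⁴ / (8 ε₀² h³ c), written in SI units, together with modern values of the constants; neither the formula in this form nor the numerical values appear in the essay.

```python
# Modern SI values of the constants; assumptions of this sketch.
ELECTRON_MASS = 9.1093837015e-31      # kilograms
ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs
VACUUM_PERMITTIVITY = 8.8541878128e-12  # farads per metre
PLANCK_CONSTANT = 6.62607015e-34      # joule seconds
SPEED_OF_LIGHT = 299792458.0          # metres per second


def rydberg_constant():
    """Rydberg constant for an infinitely heavy nucleus, in reciprocal metres."""
    numerator = ELECTRON_MASS * ELEMENTARY_CHARGE ** 4
    denominator = (8.0 * VACUUM_PERMITTIVITY ** 2
                   * PLANCK_CONSTANT ** 3 * SPEED_OF_LIGHT)
    return numerator / denominator


rydberg = rydberg_constant()
print(f"Rydberg constant = {rydberg:.6e} per metre")
```

Every quantity on the right-hand side is measured independently of spectroscopy, so the agreement with the spectroscopic value is a genuine test and not a fit.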

* A partial use of the quantum in this sense had been proposed already by NICHOLSON,
Mon. Not. Roy. Astr. Soc. 72, 679, 1912.

It is the prediction of the hydrogen spectrum that was the outstanding
success, and that may now seem a fairly simple thing, but it is proper
to remember that at the time spectroscopy was by no means in the
clear-cut state in which it could be presented later. To the spectroscopists
it was not even clear that hydrogen had a simpler spectrum than
the alkalis, and indeed they gave much more attention to these other
elements. Moreover there was the confusion that the Balmer series was
3 → 2, 4 → 2, etc., while the Lyman series of 2 → 1, 3 → 1, etc., in the
far ultra-violet was not yet known. All these matters of spectroscopy
were only familiar to the specialists, and they were barely apprehended
by most general physicists, and yet BOHR seemed to know them all in
advance. There were also other difficulties confronting him. First there 
was the small discrepancy of the hydrogen RYDBERG number compared 
with that coming from the alkalis; this he could explain as due to the 
smaller mass of the nucleus. Also some time earlier the astronomers 
had reported that the star Zeta Puppis emitted a spectrum which 
seemed to be a kind of half Balmer series 2½ → 2, 3 → 2, 3½ → 2, etc.,
and this might at first have appeared to ruin his theory. But BOHR at
once could point out that it was exactly the spectrum that ionized 
helium would produce; the thought that it might be helium had 
occurred to nobody else, because the spectrum was of course so different 
from the known helium spectrum. 
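
Both points in this paragraph, the half-integral series and the small shift of the RYDBERG number with nuclear mass, follow from one formula and can be illustrated in a few lines. The sketch assumes the usual one-electron expression, wave number = R Z² (1/n₁² − 1/n₂²), with R reduced by the factor 1/(1 + m/M) for a nucleus of mass M; the numerical constants are modern values, and the helium nucleus is taken as roughly four proton masses. All of these are assumptions of the illustration and are not stated in the essay.

```python
# Modern values; assumptions of this sketch.
RYDBERG_INF = 1.0973731568e7            # per metre, infinitely heavy nucleus
ELECTRON_TO_PROTON = 1.0 / 1836.15267   # electron mass over proton mass


def rydberg_for(nuclear_mass_in_protons):
    """Rydberg number corrected for the finite mass of the nucleus."""
    return RYDBERG_INF / (1.0 + ELECTRON_TO_PROTON / nuclear_mass_in_protons)


def wavenumber(z, n_low, n_high, rydberg):
    """Wave number (per metre) of the line n_high -> n_low for nuclear charge z."""
    return rydberg * z ** 2 * (1.0 / n_low ** 2 - 1.0 / n_high ** 2)


def apparent_levels(z, n_low, n_high):
    """Quantum numbers the line would seem to have if ascribed to hydrogen."""
    return (n_high / z, n_low / z)


# Ionized helium (z = 2) lines ending on n = 4, read as if they were hydrogen lines.
half_balmer = [apparent_levels(2, 4, n) for n in (5, 6, 7)]

# With one and the same Rydberg number, helium 6 -> 4 coincides with hydrogen 3 -> 2.
same_r_hydrogen = wavenumber(1, 2, 3, RYDBERG_INF)
same_r_helium = wavenumber(2, 4, 6, RYDBERG_INF)

# With the mass correction the two lines separate slightly.
h_alpha_nm = 1e9 / wavenumber(1, 2, 3, rydberg_for(1.0))
helium_line_nm = 1e9 / wavenumber(2, 4, 6, rydberg_for(4.0))

print(half_balmer)
print(f"hydrogen 3 -> 2: {h_alpha_nm:.2f} nm, helium 6 -> 4: {helium_line_nm:.2f} nm")
```

The helium transitions 5 → 4, 6 → 4, 7 → 4 thus masquerade as 2½ → 2, 3 → 2, 3½ → 2, and the slight displacement of the helium line from the hydrogen one is the mass effect mentioned above.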


The explanation of the hydrogen spectrum must rank with the dis- 
covery of the nucleus as one of the greatest triumphs of physics. BOHR 
next followed up its further implications and explained such matters as 
the occurrence of the RYDBERG number in other spectra. He also 
explained the ZEEMAN effect, though here, and with the doublet spectra 
of the alkalis, he not unnaturally got into difficulty over matters which 
were only cleared up long afterwards with the discovery of electron 
spin. He also attempted to explain the more detailed structure of some 
of the other atoms, but in this he was not so successful—naturally, 
since at the time there was practically no quantum mechanics and 
in particular no exclusion principle. Nevertheless these further develop- 
ments did make it clear that the visible hydrogen spectrum should 
have its analogy in the x-ray spectra of the heavier elements, and this 
led to the step which finally verified the reality of atomic numbers. 


It was of course the work of MOSELEY that accomplished this. He
departed from the laboratory’s almost exclusive devotion to radio- 
activity, and took up and developed some important points in the quite 
new science of x-ray spectroscopy. He then applied his technique to 
the discovery of the wavelengths of the characteristic x-rays, which had 
been studied by BARKLA [12] several years before. Working day and
night by himself with a very characteristic excess of energy, and in spite
of constantly pulling his apparatus to pieces in order to improve it, he 
quickly got his main results. These first results were embodied in a 
famous paper [13] giving the wavelengths of the x-ray K spectra of the 
elements from calcium to zinc. It showed that they fell into a definite
sequence of ordinal numbers, and so established firmly the reality of
atomic number. Afterwards he followed up this work while at Oxford, 
by getting the spectra of the majority of the elements [14]. In such a 
wide range it was natural that variations of method had to be used, and 
for the heavier elements it was the L, not the K, radiation that he studied. 
The result was an arrangement of the whole table of the elements in 
ordinal number; there were “screening constants” in the formulae which
prevented the direct immediate setting down of the cardinal number for 
each element, but from the cumulative evidence these were also 
unmistakable. Thus with this work every known element could have 
its number assigned, and the few missing ones, number 43, etc., could be 
determined. 
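
The way in which a screening constant enters, and the way an ordinal sequence of wavelengths fixes the number of each element, can be sketched as follows. The relation assumed here is the familiar form for the strongest K line, wave number = ¾ R (Z − 1)², with a screening constant of 1; this form, the modern value of R, and the function names are assumptions of the illustration and are not quoted in the essay.

```python
import math

RYDBERG_INF = 1.0973731568e7   # per metre, modern value (assumption)
K_SCREENING = 1.0              # screening constant for the K line (assumption)


def k_line_wavelength_m(z, screening=K_SCREENING):
    """Wavelength in metres of the strongest K line for nuclear charge z."""
    wavenumber = 0.75 * RYDBERG_INF * (z - screening) ** 2
    return 1.0 / wavenumber


def atomic_number_from_wavelength(wavelength_m, screening=K_SCREENING):
    """Invert the relation: the square root of the wave number is linear in z."""
    return screening + math.sqrt(1.0 / (0.75 * RYDBERG_INF * wavelength_m))


# Calcium (20) to zinc (30), the range covered by the first paper.
wavelengths = {z: k_line_wavelength_m(z) for z in range(20, 31)}
recovered = {z: round(atomic_number_from_wavelength(w))
             for z, w in wavelengths.items()}

for z, w in wavelengths.items():
    print(f"Z = {z}: {w * 1e10:.3f} angstrom")
```

Because the square root of the wave number rises by equal steps from one element to the next, a missing element shows itself as a doubled step in the sequence.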

With MOSELEY’s work the fixing of atomic numbers for all the
elements was complete, but there remained outstanding the question 
of the fractional atomic weights, which called for explanation, because 
there was such very definite evidence, dating from the time of Prout, 
that many of the weights were whole numbers, and yet some of them 
very definitely were not. The question then arose whether all the 
elements were composed of isotopes each having a weight of some 
integer. The first direct evidence of this came from J. J. THOMSON’s
work with positive rays*. Whereas the atomic weight of neon is 20.2,
using the “parabola method” he got a line at 20 and a weaker one at 22. 
This appeared to be strong evidence, but it could not be regarded as 
conclusive because the parabola method has no high accuracy, and also 
because many elements in a positive-ray tube form temporary hydrides, 
and it might have been argued by a skeptic that the line at 22 was a 
dihydride of neon. With a view to resolving this matter in 1913 ASTON 
set up a diffusion apparatus in the hope of separating the constituent 
gases, but he found, as many others have done, that however inescapable
the results of diffusion ought to be in theory, in practice they are
apt to be very disappointing. Though a little separation was achieved,
it refused to go very far, and there for a time the matter rested with only 
the strong probability that there were two neon isotopes. In the mean- 
while quite separate evidence pointing in a similar direction had arisen 
from another source. Lead is the end product of all the radio-elements, 
and therefore the atomic weight of lead derived from uranium should
differ from that of lead from thorium. Several experts set to work to 
determine the atomic weight of lead from various sources and they did 
get unmistakable differences in their results. Thus the older doctrine 
that all the atoms of an element weighed the same was destroyed by this 
example to the contrary.


* See Aston’s Isotopes, p. 28. 




In 1919 AsTON took up the subject again, making use of the method 
of positive rays, and designed the first mass-spectrograph. This was 
designed with the greatest attention to detail, including the study of the 
focussing of the beam of rays (the focussing of atomic rays was at that
time almost an unexplored field) and it was constructed with the high
artistry for which he was distinguished. It at once cleared up the whole 
question of isotopes. For example he very soon showed that chlorine 
had isotopes at 35 and 37, though it is proper to say that it was not quite 
simple, because there were at first hydrides at 36 and 38 also, and special 
devices had to be used to clarify which of the rays were atomic and which 
molecular. He very soon worked out the majority of the elements, and 
just as MOSELEY had given nearly all the atomic numbers, so ASTON
gave nearly all the isotopes, or at any rate nearly all those which occur 
with any sort of abundance. When he had completed this work he 
turned his attention to the fractional mass-defects of the elements, and 
got important results, but that part of his work falls rather outside the 
present subject. 
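
The fractional chemical weights quoted in this essay follow at once from whole-number isotopes mixed in fixed proportion. The sketch below is an editorial illustration which assumes exactly two isotopes of integral weight for each element and ignores the mass defects and any rarer isotopes; the function name is hypothetical.

```python
def heavy_fraction(mean_weight, light, heavy):
    """Fraction of the heavier isotope that gives the observed mean atomic weight."""
    return (mean_weight - light) / (heavy - light)


# Figures quoted in the essay: neon 20.2 with lines at 20 and 22,
# chlorine 35.46 with isotopes at 35 and 37.
neon_22 = heavy_fraction(20.2, 20, 22)
chlorine_37 = heavy_fraction(35.46, 35, 37)

print(f"neon-22 fraction:     {neon_22:.2f}")
print(f"chlorine-37 fraction: {chlorine_37:.2f}")
```

On these assumptions about one neon atom in ten is of weight 22, which agrees with the weaker line found by the parabola method, and rather less than a quarter of chlorine atoms are of weight 37.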


With the determination of the atomic numbers of the elements and 
the atomic weights of their isotopes PROUT’s original hypothesis was
confirmed, that the atoms are all built of a single weight of brick. Indeed 
it has been elaborated in a manner that he could never have foreseen 
because not only were the bricks measured in the same unit of mass, 
but also in the same unit of electric charge. But one later discovery, 
that of the neutron by CHADWICK, also fits into the pattern, because it 
has proved so much simpler to construct nuclei out of protons and 
neutrons than out of protons and electrons. This then completes the
account of what I have called an “easy” discovery, though it will have
been seen that its discovery was by no means easy to make. Whatever 
the future may bring forth in the way of modifications of our under- 
standing of the laws of mechanics, it seems impossible to believe that 
there can ever be any change in the now established principles of 
atomic weight and atomic number. 


REFERENCES 


[1] RAYLEIGH, Lord; Phil. Mag. XLI, 107, 274, 1871; Phil. Mag. 47, 375, 1899.

[2] MAXWELL; Electricity and Magnetism, Vol. 1, p. [illegible].

[3] MAXWELL; Nature, Aug. 1873.

[4] PLANCK; Vorlesungen über Wärmestrahlung, p. 163, 1906.

[5] SODDY; Chemistry of the Radio Elements, 1911.

[6] THOMSON; The Corpuscular Theory of Matter, Chap. VI, 1907.

[7] GEIGER & MARSDEN; Proc. Roy. Soc. A 82, 495, 1909. Also GEIGER; Proc.
Roy. Soc. A 83, 492, 1910.

[8] RUTHERFORD, Lord; Phil. Mag. 21, 669, 1911.

[9] GEIGER & MARSDEN; Phil. Mag. 25, 604, 1913.

[10] VAN DEN BROEK; Phys. Z. 14, 33, 1913.

[11] BOHR, N.; Phil. Mag. 26, 1913.

[12] BARKLA; Phil. Mag. 17, 739, 1909.

[13] MOSELEY; Phil. Mag. 26, 1021, 1913.

[14] MOSELEY; Phil. Mag. 27, 703, 1914.


THE DEVELOPMENT OF THE
INTERPRETATION OF THE QUANTUM 
THEORY 


W. Heisenberg 


I. THE fact that PLANCK’s quantum theory [1] would cause some
changes in the foundations of physics must have been realized after 
EINSTEIN’s [2] work on light quanta in 1905, if not earlier. The quantum 
theory, nevertheless, developed for almost another 20 years without its 
principles being clarified, and the work of BoHR, KRAMeRS and 
SLATER [3] in 1924 was the first serious attempt to resolve the paradoxes 
of radiation into rational physics. In what follows we shall briefly 
outline the history of this clarification from 1924 to 1927, and we shall 
then inquire into the criticisms which have recently been made against 
the Copenhagen interpretation of the quantum theory. 


In 1924 BOHR, KRAMERS and SLATER asserted, first of all, that the
wave propagation of light on the one hand, and its absorption and 
emission in quanta on the other, are experimental facts, which must be 
made the basis of any attempt at clarification, and not explained away; 
the fundamental consequences of this state of affairs must, therefore, 
be taken seriously. They therefore introduced the hypothesis that the 
waves are of the nature of probability waves: that they represent not a 
reality in the classical sense, but rather the “possibility” of such a 
reality. The hypothesis was that the waves defined the probability, at 
every point, that an atom present there is emitting or absorbing radia- 
tion. The absorption and emission were assumed to take place in 
quanta hν. It seemed to follow from this that the law of conservation
of energy cannot be maintained in the individual processes, and BOHR,
KRAMERS and SLATER assumed that it holds only for the statistical 
average. 

Although the assumption that the energy conservation law does not 
hold for individual processes later proved to be incorrect (the relations 
were considerably less simple than could then be foreseen), the attempt 
at interpretation made by BOHR, KRAMERS and SLATER nevertheless 
contained some very important features of the later, correct, interpre-
tation. The most important of these was the introduction of the
probability as a new kind of “objective” physical reality. This proba-
bility concept is closely related to the concept of possibility, the
“potentia” of the natural philosophy of the ancients such as ARISTOTLE;
it is, to a certain extent, a transformation of the old “potentia” concept
from a qualitative to a quantitative idea. On the other hand, the single
quantum jump of BOHR, KRAMERS and SLATER is “factual” in nature;
it “happens” in the same manner as an event in everyday life, or the
deflection of a galvanometer.


Somewhat later, BOTHE and GEIGER [4] showed experimentally by 
means of the Compton effect that the energy conservation law is valid 
for individual processes also. In the summer of 1925 quantum mech-
anics was developed, and in the spring of 1926 SCHRÖDINGER’s wave
mechanics, based on earlier work by DE BROGLIE, began to be evolved. 
The mathematical equipment of the new quantum theory was thus 
complete in its most important parts by the middle of 1926, but the 
physical significance was still extremely unclear. 


An important step forward was made by the work of BORN [5] in
the summer of 1926. In this work, the wave in configuration space was 
interpreted as a probability wave, in order to explain collision processes 
on SCHRÖDINGER’s theory. This hypothesis contained two important
new features in comparison with that of BOHR, KRAMERS and SLATER.
The first of these was the assertion that, in considering “probability 
waves,” we are concerned with processes not in ordinary three-dimen- 
sional space, but in an abstract configuration space (a fact which is, 
unfortunately, sometimes overlooked even today); the second was the 
recognition that the probability wave is related to an individual 
process. The probability wave describes the behaviour, not of a large 
number of electrons, but only of one system of particles whose number is 
finite and is given by the number of dimensions in the configuration 
space; the wave can be conceived as representing a statistical assembly 
only in so far as the experiment concerned can be repeated as often as 
we please. This can be more exactly expressed as follows: the 
probability wave in a configuration space of 3n dimensions contains
statistical statements about only one system of n electrons, which can 
for this purpose be imagined, as in Gibbs’ thermodynamics, as a 
sample selected arbitrarily from an infinite statistical assembly of 
identically constructed systems. 


Shortly afterwards, BORN’s hypothesis was extended and generalized
by the following result [6], which had been obtained in the analysis of 
fluctuations. The interpretation of the diagonal matrix elements as 
time averages in matrix mechanics necessarily leads to the conclusion 
that the squared moduli |S_ab|² of the elements of the transformation
matrix must be interpreted as the probabilities that the system will be
found to be in the state b if it is in state a. Since SCHRÖDINGER had
recognized that the wave functions were the elements of the transforma- 
tion matrices for the transition from energy states to position states, 
BORN’s hypothesis formed a particular case of this more general assump-
tion, which fitted naturally into the scheme of quantum-mechanics. 
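
The relation between the general rule and its special case can be written out as a short sketch. The index labels a, b, n, q and the normalization statements below are assumptions of this illustration, chosen to match the wording of the paragraph; they are not formulas quoted from the original essay.

```latex
% General rule: S is the (unitary) transformation matrix between two
% sets of states; the probability of finding state b, given state a, is
P(a \to b) = |S_{ab}|^2 , \qquad \sum_b |S_{ab}|^2 = 1 .

% Special case: a = energy state n, b = position q.
% The wave function is then the corresponding matrix element,
S_{nq} = \psi_n(q) ,

% so the general rule reduces to the probability-wave hypothesis:
P(n \to q)\, dq = |\psi_n(q)|^2\, dq , \qquad \int |\psi_n(q)|^2\, dq = 1 .
```

Read in this way, the probability interpretation of the wave function is the general transformation rule evaluated for one particular pair of representations.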


Even then, however, a complete interpretation of the quantum 
theory had not been achieved, for the question remained how to define
the word “state” in the theory. A hydrogen atom in its normal state
could be represented in the mathematical scheme of the theory. There 
were, however, entirely different states. For example, the track of an 
electron was seen in the cloud chamber. How should one represent in 
the theory an electron which is moving at a definite point with a definite 
velocity? 


Meanwhile (it was now the autumn of 1926), a quite new and 
different proposal for the interpretation of quantum theory had been 
made, which arose out of the development of wave mechanics. 
SCHRÖDINGER [7] attempted to deny entirely the existence of discrete
energy values and quantum jumps, and to resolve quantum theory 
into a simple classical wave theory. The motive for this attempt was 
the discovery that the discrete eigenvalues appeared in wave mechanics 
not as energies, but as the eigenfrequencies of waves, and that the electric 
charge densities, when represented as products of waves, gave the correct 
radiation amplitudes. 


At the invitation of BOHR, SCHRÖDINGER visited Copenhagen in
September, 1926, to lecture on wave mechanics. Long discussions,
lasting several days, then took place concerning the foundations of
quantum theory, in which SCHRÖDINGER was able to give a convincing
picture of the new simple ideas of wave mechanics, while BOHR
explained to him that not even PLANCK’s Law could be understood
without the quantum jumps. “If we are going to stick to this damned
quantum-jumping, then I regret that I ever had anything to do with
quantum theory,” SCHRÖDINGER finally exclaimed in despair, to which
BOHR replied: “But the rest of us are thankful that you did, because you
have contributed so much to the clarification of the quantum theory.”
At any rate, wave mechanics had brought a new viewpoint, a new 
element of simplicity, into the quantum theory, which had to be 
incorporated into its interpretation. 


The months which followed SCHRÖDINGER’s visit were a time of the
most intensive work in Copenhagen, from which there finally emerged
what is called the “Copenhagen interpretation of quantum theory,”
and I remember with pleasure the exhaustive discussions with BOHR,
often lasting till late at night, in which the usefulness of every new
attempt at interpretation was examined in the closest detail by means
of real or imagined experiments. BOHR intended to work the new simple
pictures, obtained by wave mechanics, into the interpretation of the 
theory, while I for my part attempted to extend the physical significance 
of the transformation matrices in such a way that a complete interpre- 
tation was obtained which would take account of all possible 
experiments. 


The clarification of these two approaches, at first sight apparently 
different, took place in the early part of 1927, when BOHR had gone to
Norway for several weeks on a skiing holiday. At this time BOHR
developed the foundations of his idea of “complementarity,” while I
tried to solve the problem of how to pass from an experimentally given 
situation to its mathematical representation, by inverting the question, 
that is, by the hypothesis that only those states which can be represented 
as vectors in Hilbert space can occur in nature or be realized experi- 
mentally. This method of solution, concerning which I had an 
exhaustive correspondence with PAULI at the time, had its prototype in 
EINSTEIN’s special theory of relativity. EINSTEIN had removed the 
difficulties of electrodynamics by saying that the “apparent” time of the
Lorentz transformation was the real time; he had assumed that Nature
is such that the real time always corresponds to the letter t′ in the
Lorentz transformation. Similarly, it was now assumed in quantum
mechanics that real states can always be represented as vectors in Hilbert
space (or as “mixtures” of such vectors). The uncertainty principle [8]
was the simple expression for this assumption. 


BOHR’s concept of complementarity [9] resulted in the same restric-
tions to the applicability of classical concepts, owing to the appearance
of quite different simple pictures which were “complementary,” and
which could co-exist without contradiction only if their range of appli-
cation was restricted. Some time later, BOHR’s view of complementarity
found another, very impressive representation in the mathematical
scheme of the quantum theory, when JORDAN, KLEIN and WIGNER [10]
were able to show that, starting from a simple (three-dimensional)
theory of material waves in SCHRÖDINGER’s sense, one could quantize
this theory and so come back to the Hilbert space of quantum mechanics.
The complete equivalence of the particle and wave pictures in the
quantum theory was thus demonstrated for the first time, and SCHRÖ-
DINGER’s viewpoint of a three-dimensional wave theory of matter had
found its rigorous basis. The PAULI exclusion principle and the 
Bose statistics thereby also achieved their proper place in quantum 
theory. 




From the spring of 1927, therefore, there existed at last a complete, 
unambiguous mathematical procedure for the interpretation of experi- 
ments on atoms or for predicting their results. The interpretation, too, 
contained the well-known statistical elements, which had appeared long 
since in the experiments (e.g. in α-decay, the photoeffect, etc.).


In the autumn of 1927 the Solvay Conference took place in Brussels, 
and here the new interpretation of quantum theory was exposed to 
the most ingenious criticism, particularly on the part of EINSTEIN, 
and thereby received its crucial test. Again those experiments were 
discussed whose interpretation had always offered the greatest diffi-
culties, and it became apparent over and over again that the new
interpretation contained no internal contradictions, and clearly led to
the correct experimental results. One of the finest documents of the
discussions at this conference is BOHR’s article [11] for ALBERT EIN-
STEIN’s 70th birthday. Since the Solvay Conference of 1927, the “Copen-
hagen interpretation” has been fairly generally accepted, and has
formed the basis of all practical applications of quantum theory. It 
has, however, occasionally been contradicted and criticized as the 
“orthodox” theory. 


The criticism of the theory, which we shall discuss in detail below, was,
however, partly concerned with another side of the problem, which 
became important only as time went on. What was born in Copenhagen 
in 1927 was not only an unambiguous prescription for the interpretation 
of experiments, but also a language in which one spoke about Nature 
on the atomic scale, and in so far a part of philosophy. Indeed, the 
way in which BOHR had thought about atomic phenomena since 1912
had always been something intermediate between physics and philo- 
sophy, and he had succeeded in explaining the periodic system of ele- 
ments from atomic theory only by combining fundamental inquiry with 
the practical problems of experiment. Thus he formulated the new 
interpretation of quantum theory in the philosophical language to 
which he had become accustomed by 15 years’ acquaintance with 
atoms, and which seemed best suited to the problems involved. This was 
not, however, the language of one of the traditional philosophies, 
positivism, materialism, or idealism; it was different in content, 
although it included elements from all these systems of thought. 


II. The criticism of the interpretation of quantum theory came at 
first from the older physicists, who were not prepared to sacrifice so 
much of the edifice of ideas of classical physics as was here demanded 
of them. EINSTEIN, SCHRÖDINGER and VON LAUE did not regard the
new interpretation as conclusive or convincing. In recent years, how-
ever, various younger physicists also have taken their stand against the
“orthodox” interpretation, and some have made counter-proposals,
which we shall discuss below.


The work of the opponents of the Copenhagen interpretation can be 
divided into three groups. 


The first and most numerous group takes over the interpretation of 
experiments from the Copenhagen theory without exception, at least 
in so far as experiments which have hitherto been performed are con- 
cerned, but it declares itself dissatisfied with the language used, i.e. the 
underlying philosophy, and replaces it by another. The work of ALEX-
ANDROW [12], BLOCHINZEW [13], BOHM [14], BOPP [15], DE BROGLIE [16],
FENYES [17] and WEIZEL [18] belongs to this group.


The second group attempts actually to alter the quantum theory, so 
that the new theory, although it gives the same results as the old one in 
many cases, by no means does so in all cases. The best-constructed 
attempt in this direction is by JANOSSY [19].


The third group, finally, expresses rather its general dissatisfaction 
with the quantum theory, without making definite counter-proposals, 
either physical or philosophical in nature. The statements of EINSTEIN 
[20], VON LAUE [21], SCHRÖDINGER [22], and recently of RENNINGER [23]
belong to this group. 


However, all the opponents of the Copenhagen interpretation do 
agree on one point. It would, in their view, be desirable to return to the 
reality concept of classical physics or, more generally expressed, to the 
ontology of materialism; that is, to the idea of an objective real world, 
whose smallest parts exist objectively in the same way as stones and 
trees, independently of whether or not we observe them. 


We shall explain once more, in section III of this essay, that this is 
impossible, or only partly possible, although we shall not be able after 
so many discussions of the problem to advance any new arguments. 
For the moment we shall subject the various counter-proposals against 
the Copenhagen interpretation to a short criticism; the details of the 
“orthodox” quantum theory will be supposed known to the reader.


(1) (a) BOHM [14] tries to connect particle orbits with the waves in
configuration space. DE BROGLIE also has recently taken up this idea
to some extent. For BOHM, the particles are “objectively real” struc-
tures, like the point masses of classical mechanics. The waves in
configuration space also are objective real fields, like electric fields;
but the question of course remains open whether configuration space
is a “real” space. Only our uncertainty concerning the previous history
of the system, and the properties of the measuring apparatus, are
responsible for the statistical nature of our predictions. BOHM has been
able to carry out this idea in such a way that the results for any experi-
ment are the same as in the Copenhagen interpretation. The first
consequence of this is that BOHM’s interpretation cannot be refuted by
experiment, and this is true of all the counter-proposals in the first
group. From the fundamentally “positivistic” (it would perhaps be
better to say “purely physical”) standpoint, we are thus concerned not
with counter-proposals to the Copenhagen interpretation, but with its
exact repetition in a different language. The language used, however,
is so different from the usual one that one at first supposes a difference
in the physical assumptions also; thus, in BOHM’s language, one must
assert, as PAULI had already pointed out, that electrons in a stationary
state without angular momentum are always at rest. This looks, at
first, like a contradiction to experiment, because it is well known that
any measurement of electron momenta gives the momentum distribu-
tion |ψ(p)|². To this, however, BOHM can reply that the measurement
itself can no longer be evaluated by means of the former laws; that a
normal evaluation of the result of measurement would indeed lead to
|ψ(p)|², but that when the quantum theory (particularly the “quantum
potentials” introduced ad hoc by BOHM) for the measuring apparatus is
taken into consideration, the conclusion that the electrons in a station-
ary state are in “reality” always at rest is admissible. In addition, we
find that the quantum potentials introduced by BOHM in this connection
have very remarkable properties, e.g. they differ from zero at arbi-
trarily great distances. At this price, BOHM considers himself able to
assert: “We do not need to abandon the precise, rational and objective
description of individual systems in the realm of quantum theory.”
This objective “description,” however, reveals itself as a kind of
“ideological superstructure,” which has little to do with immediate
physical reality; for the “hidden parameters” of BOHM’s interpretation
are of such a kind that they can never occur in the description of real
processes, if the quantum theory remains unchanged. In order to
escape this difficulty, BOHM does in fact express the hope that in future
experiments (e.g. in the range beyond 10⁻¹⁸ cm) the hidden parameters
may yet play a physical part, and that the quantum theory may thus
be proved false. BOHR, however, is wont to say, when such hopes are
expressed, that they are similar in structure to the sentence: “We may
hope that it will later turn out that sometimes 2 × 2 = 5, for this
would be of great advantage for our finances.” In actual fact, the
fulfilment of BOHM’s hopes would cut the ground from beneath not
only the quantum theory, but also BOHM’s interpretation. Of course,
it must at the same time be emphasized that the analogy just mentioned,
although complete, does not represent a logically compelling argument
against a possible future alteration of the quantum theory in the manner
suggested by BOHM. For it would not be fundamentally unimaginable
that, for example, a future extension of mathematical logic might give a
certain meaning to the statement that in exceptional cases 2 × 2 = 5,
and it might even be possible that this extended mathematics would be
of use in calculations in the field of economics. We are, nevertheless,
actually convinced, even without cogent logical grounds, that such
changes in mathematics would be of no help to us financially. The
author has therefore never understood how the mathematical proposals
which BOHM indicates as a possible realization of his hopes could be
used for the description of physical phenomena. If we disregard this
possible alteration of the quantum theory, then BOHM’s language, as we
have already pointed out, says nothing about physics that is different
from what the Copenhagen language says. There then remains only the
question of the suitability of his language. Besides the objection
already made, that in speaking of particle orbits we are concerned with
a superfluous “ideological superstructure,” it must be particularly
mentioned here that BOHM’s language destroys the symmetry between
p and q which is implicit in quantum theory. |ψ(q)|² indeed denotes the
probability distribution in position space, but |ψ(p)|² in his theory does
not denote that in momentum space. Since the symmetry properties 
always belong to the intrinsic physical substance of a theory, it is 
difficult to see what is gained by omitting them in the corresponding 
language. 


The same objection applies to DE BROGLIE’s attempts [16] to introduce 
pilot waves; here also |ψ(q)|² represents the probability distribution in
position space, but |ψ(p)|² does not represent that in momentum space.
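
The asymmetry between p and q described here can be made explicit in formulas. What follows is a sketch in modern notation: the symbols ψ̃, R and S and the polar decomposition of the wave function are assumptions of this illustration, taken from the usual presentation of the pilot-wave picture and not from the essay itself.

```latex
% Momentum-space wave function as the Fourier transform of \psi(q):
\tilde\psi(p) = \frac{1}{\sqrt{2\pi\hbar}} \int \psi(q)\, e^{-ipq/\hbar}\, dq .

% Copenhagen reading: both expressions are probability distributions,
% so that p and q enter symmetrically.
|\psi(q)|^2\, dq , \qquad |\tilde\psi(p)|^2\, dp .

% Pilot-wave reading (assumed standard form): write
% \psi = R\, e^{iS/\hbar} with R and S real; the particle momentum is
p = \frac{\partial S}{\partial q} .

% For a real stationary \psi(q), S does not depend on q, hence p = 0,
% although |\tilde\psi(p)|^2 is in general spread over many values of p.
```

On these assumptions the position distribution keeps its usual meaning, while the momentum attributed to the particle is no longer distributed according to |ψ̃(p)|², which is the loss of symmetry objected to in the text.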


(b) A similar objection can be raised, in a somewhat different form,
against the statistical interpretations of BOPP [15] and FENYES [17].
These interpretations again adhere entirely at first, as regards physical
consequences, to the Copenhagen interpretation; they are thus, in the
positivistic sense, isomorphic with it, as is BOHM’s. However, in the
language they use, they violate the symmetry between wave and par-
ticle, which has to be regarded as an essential feature of quantum
theory since BOHR’s work in 1927 and the investigations of JORDAN,
KLEIN and WIGNER. BOPP and FENYES consider the particles as objec-
tive physical realities, more or less in the sense of materialistic ontology,
but not the (three-dimensional) material waves or radiation waves in
the formulation of JORDAN, KLEIN and WIGNER. There is, however, no
reason in the quantum theory to prefer particles to waves or vice versa.
BOPP considers the appearance or disappearance of a particle as the
real fundamental process of quantum theory, and he interprets the laws
of quantum mechanics as a special case of correlation statistics for
such events. Such an interpretation, as BOPP has shown, can be carried
out without contradiction, and throws light upon the interesting
relations between quantum theory and correlation statistics. The
symmetry between corpuscle and wave, however, could only be ensured
if the corresponding correlation statistics were developed for three-
dimensional waves as well, the question being to some extent left
open whether the particles or the waves are to be considered as the
“actual” reality. Such an extension of BOPP’s ideas has not yet been
attempted. 


Whereas BOPP otherwise takes expressly the standpoint of the
ordinary quantum theory, FENYES considers that large deviations are
“basically” possible. For example, he says that “the existence of the
uncertainty principle” (which he connects with certain statistical
relations) “by no means renders impossible the simultaneous measure-
ment, with arbitrary accuracy, of position and velocity.” FENYES does
not, however, state what nature such measurements would have in 
practice, and his considerations therefore seem to remain abstract 
mathematics. 


WEIZEL [18], whose proposals are akin to those of BOHM and of
FENYES, relates the “hidden parameters” to a new kind of particle, the
“zeron,” which is not otherwise observable. Such a concept, however,
runs into the danger that the interaction between the real particles and 
the zerons dissipates the energy among the many degrees of freedom of 
the zeron field, so that the whole of thermodynamics becomes a chaos. 
WEIZEL has not explained how he proposes to avoid this danger. 
Furthermore, the same objections may be made to his proposals as to 
the other work hitherto discussed. 


(c) The standpoint of the entire group of publications mentioned 
above can perhaps best be defined by recalling the similar discussion of 
the special theory of relativity. Anyone who was dissatisfied with 
EINSTEIN’s negation of absolute space and absolute time could then 
argue somewhat as follows. The non-existence of absolute space and 
absolute time is by no means proved by the special theory of relativity. 
It has been shown only that true space and true time do not occur in 
any ordinary experiment, but if this aspect of the laws of Nature has 
been correctly taken into account, and thus the correct “apparent”
times have been introduced for moving co-ordinate systems, there would 
be no arguments against the assumption of an absolute space. It would 
even be plausible to assume that the centre of gravity of our Galaxy is 
(at least approximately) at rest in absolute space. The critic of the special 
theory of relativity might add that we may hope that future measure-
ments will allow the definition of absolute space (that is, of the “hidden
parameter” of the theory of relativity), and that the theory of relativity
will thus be refuted.


It is seen at once that this argument cannot be refuted by experiment, 
since it as yet makes no assertions which differ from those of the special 
theory of relativity. Such an interpretation of the theory of relativity, 
however, would destroy, at least in the language used, just the decisive 
symmetry property of the theory of relativity, namely the Lorentz 
invariance, and it must therefore be considered inappropriate. 


The analogy to the quantum theory is obvious. The laws of quantum 
theory are such that the “hidden parameters,” invented ad hoc, can 
never be observed. The decisive symmetry properties are thus destroyed 
if we introduce the hidden parameters as a fictitious entity into the 
interpretation of the theory. 


(d) The work of BLOCHINZEW [13] and ALEXANDROW [12] is quite 
different, in its statement of the problem, from those discussed above; 
these authors, expressly and from the first, restrict their objections to 
the philosophical side of the problem. At the physical level they accept 
the Copenhagen interpretation unreservedly. The external form of the 
polemic is so much the sharper: “Among the different idealistic trends
in contemporary physics, the so-called Copenhagen school is the most 
reactionary. The present article is devoted to the unmasking of the 
idealistic and agnostic speculations of this school on the basic problems 
of quantum mechanics,” writes BLOCHINZEW [13] in his introduction. 
The acerbity of the polemic shows that here it is a matter not of science 
alone, but of a confession of faith. The aim is expressed at the end with 
a quotation from the work of LENIN: “However marvellous, from the 
point of view of a common human intellect, the transformation of the 
imponderable ether into ponderable matter, however strange the elec- 
tron’s lack of any but electromagnetic mass, however unusual the restric- 
tion of the mechanical laws of motion to but one realm of natural 
phenomena and their subordination to the deeper laws of electro- 
magnetic phenomena, and so on—all this is but another confirmation 
of dialectical materialism.” Although the hypotheses in the work of
BLOCHINZEW and ALEXANDROW thus originate outside science, the 
discussion of their arguments is nevertheless very instructive. 


Here, where the task is to rescue materialistic ontology, the attack is 
chiefly made against the introduction of the observer into the interpre-
tation of the quantum theory. ALEXANDROW [12] writes: “We must
therefore understand by ‘result of measurement,’ in the quantum
theory, only the objective effect of the interaction of the electron with a
suitable object. Mention of the observer must be avoided, and we must
treat objective conditions and objective effects. A physical quantity is
an objective characteristic of the phenomenon, but not the result of an
observation.” According to ALEXANDROW, the wave function ψ
characterizes the “objective” state of the electron.


In his presentation, ALEXANDROW overlooks the fact that the 
interaction of a system with a measuring apparatus, if the apparatus 
and the system are regarded as cut off from the rest of the world and 
are treated as a whole according to quantum mechanics, does not as a 
rule lead to a definite result (e.g. the blackening of a photographic plate 
at a given point). If the defence against this reasoning is that “in reality”
the plate is blackened at a given point after the interaction, the rejoinder 
is that the quantum-mechanical treatment of the closed system electron 
+ plate is no longer being applied. It is the “factual” character of an 
event describable in terms of the concepts of daily life which is not 
automatically contained in the mathematical formalism of the quantum 
theory, and which appears in the Copenhagen interpretation by the 
introduction of the observer. Of course, the introduction of the 
observer must not be misunderstood to imply that some kind of sub- 
jective features are to be brought into the description of Nature. The 
observer has rather only the function of registering decisions, i.e. 
processes in space and time, and it does not matter whether the observer 
is an apparatus or a human being; but the registration, i.e. the transi- 
tion from the possible to the actual, is absolutely necessary here, and 
cannot be omitted from the interpretation of the quantum theory. 
(See, in this connection, the considerations of VON WEIZSÄCKER [24]
on the part played by the “actual” in the theory of heat, and the
remarks of LUDWIG [25] on the relations between quantum theory and
thermodynamics.) It must also be pointed out that in this respect the 
Copenhagen interpretation of quantum theory is in no way positivistic. 
For whereas positivism is based on the sensual perceptions of the 
observer as the elements of reality, the Copenhagen interpretation 
regards things and processes which are describable in terms of classical 
concepts, i.e. the actual, as the foundation of any physical interpretation. 
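
The point made against ALEXANDROW can be illustrated by a short sketch. The notation is an assumption of this illustration and does not occur in the essay: φ_i are states of the system, A_0 is the ready state of the apparatus, A_i are its pointer states, and c_i are the expansion coefficients.

```latex
% Quantum-mechanical treatment of the closed system (object + apparatus):
% the interaction correlates the two, but selects no single result.
\Big( \sum_i c_i\, \varphi_i \Big) A_0
  \;\longrightarrow\;
  \sum_i c_i\, \varphi_i\, A_i .

% Registration (transition from the possible to the actual):
% one result k is found, with probability
P(k) = |c_k|^2 , \qquad \sum_k |c_k|^2 = 1 .
```

The first line is all that the formalism yields for the closed system. The second line, the occurrence of one definite result, is what the text calls the “factual” element, which enters the interpretation through the registration.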


BLOCHINZEW [13] formulates matters slightly differently from 
ALEXANDROW: “In quantum mechanics we describe, not the state of a 
particle in ‘itself’ but the fact that the particle belongs to this or that 
assembly. This belonging is completely objective, and does not depend 
on statements made by the observer.” Such formulations, of course,
take us very far (probably too far) from materialistic ontology. For, in
classical thermodynamics for example, things are different. The deter-
mination of the temperature of a system implies to the observer that the
system is just one sample out of a canonical ensemble, and thus he may
consider it as possibly having different energies. “In reality,” however,
it has only one definite energy at a given time, and none of the others is
realized; the observer has therefore been deceived if he considered a
different energy at that moment as possible. There are indeed diffi-
culties at this point in the quantum theory, with the words “in reality”;
we shall discuss this below. If, however, a “completely objective”
character is ascribed to a particle’s belonging to a quantum-mechanical
assembly (especially for a mixture of states), the word “objective” is
used in a somewhat different sense from classical physics; for, 
“belonging to an assembly” always means in classical physics, at least 
where a past event is concerned, a statement also about the observer’s 
degree of knowledge of the system. Thus one sees: such concepts as 
“objective reality” have no immediately evident meaning, when they 
are applied to the situation which one finds in atomic physics. 


Above all, we see from these formulations how difficult it is when we 
try to push new ideas into an old system of concepts belonging to an 
earlier philosophy, or, to use an old metaphor, when we attempt to put 
new wine into old bottles. Such attempts are always distressing, for 
they mislead us into continually occupying ourselves with the inevitable 
cracks in the old bottles, instead of rejoicing over the new wine. 


(2) Unlike the investigations discussed above, the work of JANOSSY [19]
makes its attack on the “orthodox” quantum theory entirely on the
firm ground of physics. JANOSSY has realized that the assumption that
quantum mechanics is rigorously valid compels us to depart from the 
reality concept of classical physics, and he therefore seeks to alter 
quantum mechanics in such a way that, although many of the results 
remain true, its structure approaches that of classical physics. His point 
of attack is what is called “the reduction of wave-packets,” i.e. the fact
that the wave function representing the system changes discontinuously
when the observer takes cognizance of a result of measurement. JANOSSY
asserts that this reduction cannot be deduced from SCHRÖDINGER’s
equation, and he believes that he can conclude from this that there is an
inconsistency in the “orthodox” interpretation. It is well known that
the reduction of wave-packets always appears in the Copenhagen 
theory when the transition is completed from the possible to the actual 
(in the formalism, always for a statistical mixture of states), i.e. the 
actual is selected from the possible, which is done by the “observer,”
in the usual nomenclature. The underlying assumption is that the 
interference terms are in the actual experiment removed by the partly 
undefined interactions of the measuring apparatus, with the system and 
with the rest of the world (in the formalism, the interaction produces a 
“mixture”). JANOSSY now tries to alter quantum mechanics by the
introduction of damping terms, in such a way that the interference
terms disappear of themselves after a finite time. Even if this corres- 
ponded to reality (and there is no reason to suppose this from the 
experiments which have yet been performed), there would still remain a
number of alarming consequences of such an interpretation, as JANOSSY 
himself points out (e.g. waves which are propagated faster than the 
velocity of light, interchange of the time sequence of cause and effect 
for moving observers, i.e. distinction of certain co-ordinate systems), so 
that we should hardly be ready to sacrifice the simplicity of quantum 
theory for this kind of view until we are compelled by experiments to 
do so. 


(3) Among the remaining opponents of the “orthodox” quantum
theory, SCHRÖDINGER [22] takes an exceptional position, inasmuch as
he would ascribe the “objective reality” not to the particles, but to the
waves, and is not prepared to interpret the waves as “probability waves
only.” In his work “Are there quantum jumps?” he attempts to deny
the existence of quantum jumps altogether. Now SCHRÖDINGER’s work,
first of all, contains some misunderstandings of the usual interpretation.
He overlooks the fact that only the waves in configuration space, that is
the transformation matrices, are probability waves in the usual
interpretation, while the three-dimensional material waves or radiation waves
are not. The latter, according to BOHR and to KLEIN, JORDAN and
WIGNER, have just as much (and just as little) “objective reality” as
particles; they have no direct connection with probability waves, but
have a continuous density of energy and of momentum, like a Maxwell
field. SCHRÖDINGER therefore rightly emphasizes that at this point the
processes can be conceived of as being more continuous than they
usually are. Of course, SCHRÖDINGER cannot hereby remove the
element of discontinuity from the world, which is found everywhere in
atomic physics (very obviously, for instance, on the scintillation screen).
In the usual interpretation of quantum theory, it is contained in the
transition from the possible to the actual. SCHRÖDINGER himself makes
no counter-proposal as to how he intends to introduce the element of
discontinuity, everywhere observable, in a different manner from the
usual interpretation.


The criticism of quantum theory which has been expressed at times 
by EINSTEIN [20] and VON LAUE [21] (cf. also the work of RENNINGER 
[23] and others) starts, like the other publications discussed so far, from 
the fear that the quantum theory might deny the existence of an 
objectively real world, and so might cause the world to appear in some 
way (by a misunderstanding of the tenets of idealistic philosophy) as an 
illusion. The physicist must, however, postulate in his science that he 
is studying a world which he himself has not made, and which would be
present, essentially unchanged, if he were not there. Although this 
problem has already received detailed treatment in the literature, we 
shall give here a further analysis, showing to what extent this basis of 
all physics has been maintained in the Copenhagen interpretation of 
quantum theory. 


III. We begin by recalling some of the considerations of Gibbs’ thermodynamics,
which BOHR has always pointed to as an especially clear
application of the theory of knowledge in physics. As an example, we 
imagine a piece of metal which can emit electrons in consequence of 
thermal motion (classical mechanics is supposed valid for the electrons); 
let a measuring apparatus be placed in the neighbourhood, which 
registers the emission of an electron with a velocity above a certain 
limit by an irreversible process (e.g. the blackening of a photographic 
plate). Let the apparatus be adjusted to such a threshold velocity of 
the triggering electrons from the metal that their emission takes place 
only infrequently, e.g. at an average interval of some hours. 


The measurement of the temperature of the piece of metal leads to 
an “objective” determination of a property of the metal, which we 
express mathematically by regarding the “metal” system as a sample 
arbitrarily selected from a canonical ensemble. Here “objective”
means that any thermometer can be used, provided that it is usable as 
a measuring instrument, and that the results of the temperature 
measurement do not depend on either the measuring instrument or 
the observer. If the metal and the measuring apparatus are entirely 
separate from the rest of the world, this system has also a constant 
energy, whose value is not exactly known, on account of the canonical 
distribution. If, however, the metal is in contact with the external 
world, its energy varies with time and oscillates in temperature equili- 
brium about the mean value, in the manner indicated by the canonical 
distribution. If the canonical ensemble of the mechanical system 
metal + measuring apparatus is followed by Newtonian mechanics, 
this ensemble evolves in the course of time in such a way that it contains 
a continually increasing proportion of states in which the photographic 
plate of the measuring apparatus is blackened (it being assumed 
unexposed at the start of the experiment). The probability that the 
measuring apparatus will respond can hence be calculated, but the exact 
instant cannot be predicted. If every detail of the system were known 
at the start of the experiment, we should have been able to determine 
the instant exactly beforehand, provided that the system was cut off 
from the external world. The statement of the temperature would then 
have been completely meaningless. If, however, the system was connected
with the external world, even a knowledge of every detail of the
metal and of the measuring apparatus in the beginning would have 
been of no avail for predicting the result of the experiment, because 
we do not know every detail of the external world. We have till now 
called this entire description “objective,” and have given the reasons
for doing so. Nevertheless, it also contains a “subjective” element, as
we shall now see. Namely, in the absence of an observer, the mathema- 
tical representation of the system would go on changing continuously, 
in the way we have outlined. If, however, the observer is present, he 
will suddenly register the fact that the plate is blackened. The transition 
from the possible to the actual is thereby completed as far as he is
concerned; he correspondingly alters the mathematical representation 
discontinuously, and the new ensemble contains only the blackened 
photographic plate. This discontinuous change is naturally not 
contained in the mechanical equations of the system or of the ensemble 
which characterizes the system; it corresponds exactly to the “reduction
of wave-packets” in the quantum theory, as we shall explain below.
We see from this that the characterization of a system by an ensemble 
not only specifies the properties of this system, but also contains 
information about the extent of the observer’s knowledge of the system. 
To this extent, the use of the word “objective” for the characterization 
of the system by the ensemble is problematical. 
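
The classical ensemble picture just described can be imitated numerically. What follows is only an illustrative sketch and not part of the original argument: it assumes that the rare emission of a triggering electron can be modelled as a Poisson process, so that each member of the ensemble blackens its plate at an exponentially distributed instant. The names `make_ensemble`, `blackened_fraction` and `reduce_on_observation`, and the mean waiting time of 3 hours, are hypothetical choices made for the illustration.

```python
import math
import random

def make_ensemble(size, mean_wait_hours, seed=0):
    """Draw one emission instant per ensemble member (assumed exponential waiting times)."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean_wait_hours) for _ in range(size)]

def blackened_fraction(ensemble_times, t):
    """Fraction of ensemble members whose plate is blackened by time t."""
    return sum(1 for t_emit in ensemble_times if t_emit <= t) / len(ensemble_times)

def reduce_on_observation(ensemble_times, t_observed):
    """The observer registers a blackened plate at t_observed: keep only compatible members."""
    return [t_emit for t_emit in ensemble_times if t_emit <= t_observed]

ensemble = make_ensemble(size=20000, mean_wait_hours=3.0)

# Continuous evolution: the proportion of blackened plates keeps increasing.
for t in (1.0, 3.0, 6.0):
    simulated = blackened_fraction(ensemble, t)
    predicted = 1.0 - math.exp(-t / 3.0)
    print(f"t = {t:.0f} h: simulated {simulated:.3f}, predicted {predicted:.3f}")

# Discontinuous change: the new ensemble contains only blackened plates.
reduced = reduce_on_observation(ensemble, t_observed=3.0)
print(blackened_fraction(reduced, 3.0))
```

The growth of the blackened fraction is contained in the evolution of the ensemble, whereas the step performed by `reduce_on_observation` is a change in the observer's bookkeeping and not in the mechanics of any single member.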


After these preliminary remarks, let us return to quantum mechanics. 
According to the situation, an individual atomic system can be repre- 
sented by a wave function or by a statistical mixture of such functions, 
i.e. by an ensemble (mathematically, by a density matrix). If the system 
interacts with the external world, only the second representation is 
possible, since we do not know the details of the “external world”
system. If the system is closed, we may in some circumstances have, at
least approximately, a “pure case,” and the system is then represented
by a vector in Hilbert space. The representation is, in this particular
case, completely “objective,” i.e. it no longer contains features connected
with the observer’s knowledge; but it is also completely abstract
and incomprehensible, since the various mathematical expressions
ψ(q), ψ(p), etc., do not refer to real space or to a real property; it thus,
so to speak, contains no physics at all. The representation becomes a 
part of the description of Nature only by being linked to the question 
of how real or possible experiments will result. From this point we 
must take into consideration the interaction of the system with the 
measuring apparatus and use a statistical mixture in the mathematical 
representation of the larger system composed of the system and the 
measuring apparatus. It might appear that this could in principle be
avoided if it were possible to separate the system and the measuring 
apparatus, as a compound system, entirely from the external world.
However, BOHR has rightly pointed out on many occasions that the connection
with the external world is one of the necessary conditions for
the measuring apparatus to perform its function, since the behaviour of 
the measuring apparatus must be capable of being registered as some- 
thing actual, and therefore of being described in terms of simple con- 
cepts, if the apparatus is to be used as a measuring instrument at all, and 
the connection with the external world is therefore necessary. The com- 
pound system of system and measuring apparatus is therefore now 
described mathematically by a mixture, and thus the description con- 
tains, besides its objective features, also the previously discussed state- 
ments about the observer’s knowledge. If the observer later registers 
a certain behaviour of the measuring apparatus as actual, he thereby 
alters the mathematical representation discontinuously, because a 
certain one among the various possibilities has proved to be the real 
one. The discontinuous “reduction of wave-packets,” which cannot
be derived from SCHRÖDINGER’s equation, is thus, exactly as in Gibbs’
thermodynamics, a consequence of the transition from the possible to 
the actual. Of course it is entirely justified to imagine this transition, 
from the possible to the actual, moved to an earlier point of time, for 
the observer himself does not produce the transition; but it cannot be 
moved back to a time when the compound system was still separate 
from the external world, because such an assumption would not be com- 
patible with the validity of quantum mechanics for the closed system. 
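
The distinction between a pure case, a mixture, and the discontinuous change made by the observer can be written out for the smallest possible example. The sketch below is illustrative only: it takes a system with two possible outcomes, and it simply deletes the interference terms by hand, which stands in for the partly undefined interactions with the external world (no model of those interactions is attempted). The function names are hypothetical.

```python
def density_matrix(amplitudes):
    """Density matrix of a pure case given by its amplitudes."""
    return [[a * b.conjugate() for b in amplitudes] for a in amplitudes]

def drop_interference(rho):
    """Mixture obtained by deleting the off-diagonal (interference) terms."""
    n = len(rho)
    return [[rho[i][j] if i == j else 0.0 for j in range(n)] for i in range(n)]

def purity(rho):
    """Trace of rho squared: 1 for a pure case, less than 1 for a mixture."""
    n = len(rho)
    return sum((rho[i][j] * rho[j][i]).real for i in range(n) for j in range(n))

def reduce_to_outcome(rho, k):
    """The observer registers outcome k as actual and alters the representation."""
    if rho[k][k].real <= 0:
        raise ValueError("outcome k has zero probability in this mixture")
    n = len(rho)
    return [[1.0 if i == k and j == k else 0.0 for j in range(n)] for i in range(n)]

pure = density_matrix([2 ** -0.5, 2 ** -0.5])  # equal superposition of two outcomes
mixture = drop_interference(pure)              # possibilities without interference

print(round(purity(pure), 6))         # 1.0
print(round(purity(mixture), 6))      # 0.5
print(reduce_to_outcome(mixture, 0))  # only the registered outcome remains
```

In this toy picture the step from `pure` to `mixture` corresponds to the connection with the external world, and `reduce_to_outcome` to the transition from the possible to the actual; neither step is generated by the equation of motion of the closed system.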


We see from this that a system cut off from the external world is 
potential but not actual in character, or, as BOHR has often expressed it,
that the system cannot be described in terms of the classical concepts. 
We may say that the state of the closed system represented by a Hilbert 
vector is indeed objective, but not real, and that the classical idea of 
“objectively real things” must here, to this extent, be abandoned. The 
characterization of a system by its Hilbert vector is complementary to 
its description in terms of classical concepts, in a similar manner to the 
way in which the statement of the microscopic state is complementary 
in Gibbs’ thermodynamics to the statement of the temperature. The 
description of a fact can be effected in terms of classical concepts in just 
the approximation in which classical physics can be used. The 
mathematics of quantum theory can be used for this description as 
well, i.e. the boundary between the object in quantum theory and the 
observer who describes or measures in time and space can be pushed 
further and further in the direction of the observer. In this case the 
measuring apparatus must be characterized as a statistical mixture, 
and account must be taken of the fact that the individual states in such 
a mixture are again altered by interaction with the observer. Knowledge
of the “actual” is thus, from the point of view of the quantum
theory, by its nature always an incomplete knowledge. For the same 
reason, the statistical nature of the laws of microscopic physics cannot 
be avoided. 

The criticism of the Copenhagen interpretation of the quantum 
theory rests quite generally on the anxiety that, with this interpretation, 
the concept of “objective reality” which forms the basis of classical 
physics might be driven out of physics. As we have here exhaustively 
shown, this anxiety is groundless, since the “actual” plays the same
decisive part in quantum theory as it does in classical physics. The
Copenhagen interpretation is indeed based upon the existence of
processes which can be simply described in terms of space and time, i.e.
in terms of classical concepts, and which thus compose our “reality”
in the proper sense. If we attempt to penetrate behind this reality into
the details of atomic events, the contours of this “objectively real”
world dissolve—not in the mist of a new and yet unclear idea of reality,
but in the transparent clarity of a mathematics whose laws govern the
possible and not the actual. It is of course not by chance that “objective
reality” is limited to the realm of what Man can describe simply in 
terms of space and time. At this point we realize the simple fact that 
natural science is not Nature itself but a part of the relation between 
Man and Nature, and therefore is dependent on Man. The idealistic 
argument that certain ideas are a priori ideas, i.e. in particular come 
before all natural science, is here correct. The ontology of materialism 
rested upon the illusion that the kind of existence, the direct “actuality”
of the world around us, can be extrapolated into the atomic range. 
This extrapolation, however, is impossible. 


Since all counter-proposals hitherto made against the Copenhagen 
interpretation have found themselves compelled to sacrifice essential 
symmetry properties of the quantum theory, we may well suppose that 
the Copenhagen interpretation is unavoidable if these symmetry 
properties, like the Lorentz invariance, are held to be a genuine feature 
of Nature; and every experiment yet performed supports this view. 


REFERENCES 


[1] M. PLANCK; Verhandl. Deutsch. Phys. Ges. 2, 237, 1900

[2] A. EINSTEIN; Ann. Physik (4) 17, 132, 1905

[3] N. BOHR, H. KRAMERS and J. C. SLATER; Z. Phys. 24, 69, 1924

[4] W. BOTHE and H. GEIGER; Z. Phys. 33, 639, 1925

[5] M. BORN; Z. Phys. 37, 863, 1926 and 38, 803, 1926

[6] W. HEISENBERG; Z. Phys. 40, 501, 1926

[7] E. SCHRÖDINGER; Ann. Phys. 79, 361, 489, 734, 1926

[8] W. HEISENBERG; Z. Phys. 43, 172, 1927

[9] N. BOHR; Naturwissenschaften 16, 245, 1928

[10] P. JORDAN and O. KLEIN; Z. Phys. 45, 751, 1927, and
     P. JORDAN and E. WIGNER; Z. Phys. 47, 631, 1928

[11] N. BOHR, in Albert Einstein, Philosopher-Scientist. The Library of Living
     Philosophers, Inc., Vol. 7, p. 199. Evanston 1949

[12] A. ALEXANDROW; Dokl. Akad. Nauk 84, (2), 1952

[13] D. BLOCHINZEW; Sowjetwissenschaft 6 (4), 1953

[14] D. BOHM; Phys. Rev. 84, 166, 1951 and 85, 180, 1952

[15] F. BOPP; Z. Naturforsch. 2a (4), 202, 1947; 7a, 82, 1952; 8a, 6, 1953

[16] L. DE BROGLIE; La physique quantique restera-t-elle indéterministe?
     Gauthier-Villars, Paris 1953

[17] I. FENYES; Z. Phys. 132, 81, 1952

[18] W. WEIZEL; Z. Phys. 134, 264, 1953 and 135, 270, 1953

[19] L. JANOSSY; Ann. Physik (6) 11, 324, 1952

[20] e.g. A. EINSTEIN in Albert Einstein, Philosopher-Scientist. The Library of Living
     Philosophers, Inc., Vol. 7, pp. 665 ff. Evanston 1949

[21] e.g. M. VON LAUE; Naturwissenschaften 38, 60, 1951

[22] E. SCHRÖDINGER; Brit. J. Phil. Sci. 3, 109, 233, 1952

[23] M. RENNINGER; Z. Phys. 136, 251, 1953

[24] C. F. WEIZSÄCKER; Ann. Physik (5) 36, 275, 1939

[25] G. LUDWIG; Z. Phys. 135, 483, 1953


EXCLUSION PRINCIPLE, LORENTZ
GROUP AND REFLECTION OF
SPACE-TIME AND CHARGE


W. Pauli 


DEDICATION 


THE 70th anniversary of NIELS BOHR’s birthday reminds me of a long
and still continuing common pilgrimage since the year 1922, in which 
so many stations are involved. Without pretension of completeness I 
mention here only some of them in their relation to the particular subject 
of this paper which, I hope, he will permit me to dedicate to him on 
the occasion of this celebration. 


After a brief period of spiritual and human confusion, caused by a 
provisional restriction to “Anschaulichkeit,” a general agreement was
reached following the substitution of abstract mathematical symbols, as 
for instance psi, for concrete pictures. Especially the concrete picture 
of rotation has been replaced by mathematical characteristics of the 
representations of the group of rotations in three dimensional space. 
This group was soon amplified to the Lorentz group in the work of 
Dirac. Fortunately, with his fine instinct for physical realities, he 
started his argument without knowing the end of it: a theory which has 
exact symmetry with respect to the sign of the electric charge, the energy 
of which is always positive, and which predicts the creation and anni- 
hilation of pairs. To reach this end Dirac’s ideas cooperated with the 
exclusion principle in a new and surprising way. 


The mathematical group was further amplified by including the 
reflections of space and time. Already in a rather early stage of the 
theory WIGNER developed interesting consequences for time reversal 
out of the apparently trivial remark that by definition the replacement 
of a function by its conjugate complex is not a “linear operator.” Later
SCHWINGER gave an alternative formulation by showing that one sees 
more if one reads both ways: not only from left to right, but also from 
right to left, a lesson which I modestly try to apply. 


I believe that this paper also illustrates the fact that a rigorous 
mathematical formalism and epistemological analysis are both indispensable
in physics in a complementary way in the sense of NIELS
BOHR. While I try to use the former to connect all mentioned features
of the theory with help of a richer “fulness” of plus and minus signs in
an increasing “clarity,” the latter makes me aware that the final
“truth” on the subject is still “dwelling in the abyss”*.


§ 1. Introduction 


Although I have treated the subject of the connection between spin 
(half-integral or integral) and statistics (antisymmetric or symmetric 
wave-functions, and anticommutators or commutators of the field 
observables used in field quantization) some time ago in several papers 
(see [1] to [5]) there are still reasons to resume and to amend this old 
discussion. One reason is that in the old papers I restricted the argu- 
ments essentially to the case of free particles. Today the investigation 
of interactions between different kinds of particles or fields is much 
more in the centre of general interest and this changes in part the weight 
of the old arguments. For instance, in order to exclude theoretically 
the possibility of the quantization of fields with integral spin with anti- 
commutators, I introduced a particular postulate, that the commutators 
of physical quantities have to vanish in points with space-like distance. 
This has been done in order to get rid of the possibility of making the 
commutator of the field in the absence of force equal to the Δ^(1)-function
instead of the Δ-function. However, this anomalous quantization
cannot be extended to a rule consistent with the field equations in
the case that any local interaction of the particles with each other (non- 
linearity of the field equation) or with other particles is taken into 
account. Hence in this more general case the particular postulate in 
question seems to be superfluous. 


Regarding the case of half-integral spin the original argument of 
Dirac that the energy-values have to be positive which led him to his 
theory of holes, seems to me still the best theoretical a priori argument 
in favour of the validity of the exclusion principle for these spin values. 
It is true that also for the quantization of spinor-fields with anti- 
commutators the positive sign of the energy (relative to the energy of the 
vacuum as zero) is guaranteed for free particles only. If interactions 
are taken into account, in order to avoid infinities, we must apply a 
particular technique of additional rules, the so-called renormalization 
of the constants of mass and charge. Although some progress has been 
made in this way, the success of this method is certainly only a limited 


* I refer here to BOHR’s favourite verses of SCHILLER:


“Nur die Fülle führt zur Klarheit
Und im Abgrund wohnt die Wahrheit.”




one and only for particular interactions. As this method is essentially 
based on the assumption that the theory for free particles (without 
interactions) holds exactly for the so-called one particle states*, the old 
argument of Dirac in his theory of holes is still valid at least for these 
particular states. 


Another reason to resume the discussion is the fact that more recently 
attempts have been made to establish a connection between the pro- 
perties of the fields for reflections of space-time and of charge, or more 
precisely, for particle-antiparticle conjugation† on the one hand and
spin-statistics on the other hand. Since, for instance, the charge-
density for spin ½-particles is positive-definite in the c-number theory,
the idea suggested itself to derive the quantization of this kind of field
with anticommutators (exclusion principle) from the postulate of charge-
symmetry. For a single spinor field this was indeed rather easy, but for
the case of several spin ½-fields the situation is more complicated if one
does not assume a particular kind of transformation for charge- 
conjugation but only assumes the existence of some suitable transforma- 
tion without using the postulate of positive energies (see [4]). In this 
paper we shall therefore assume the normal transformation for particle- 
antiparticle conjugation (AC) which connects every spinor-field with 
its own conjugate complex also in the case of several spinor fields. 
Formally in order to obtain charge-symmetry it is then necessary, 
taking into account the quantization of the field with the help of 
anticommutators, to anti-symmetrize every expression bilinear in the 
field variables in its transcription from the c-number to the q-number
theory, as was first shown by HEISENBERG [8]. 


Following further along this line of thought progress was made by 
SCHWINGER [9] who gave a simple general expression for the trans- 
formation of different kinds of fields, if simultaneously not only every 
particle transforms into its anti-particle (which includes charge con- 
jugation), but also the space-time coordinates change their sign. This 
I shall call in the following the strong reflection and abbreviate it by SR. 
It is an essential part of the transformation given by SCHWINGER that 
in the expression for all operators the order of factors has to be reversed, 
in other words all formulas have to be read from right to left instead of 
from left to right. I shall denote this part of the SR in the following 
briefly with the word “inversion.” The particular transformation rule
given by SCHWINGER for SR only holds if the half-integral spin fields 


* A formulation of the renormalization technique without reference to perturbation
methods is given by G. KÄLLÉN [6]. An interesting non-relativistic example is discussed
by LEE [7].

† Abbreviated hereafter as AC.


are quantized with anticommutators, the integral spin-fields with 
commutators*. 


By combining the SR with the AC one gets another transformation, 
for which space-time is reflected, but the electric charge preserves its 
sign and no change of particle into antiparticle takes place. This I call 
the weak reflection and denote it by WR. The WR has been formulated 
already in the non-relativistic quantum mechanics by WIGNER [10] and 
has then been generalized to the relativistic case (see for instance [11]). 
It can easily be seen (see WATANABE [12]) that the WR holds, irrespective 
of the commutation rules for the fields, so that unlike the case of SR or 
AC no conclusions can be drawn from it regarding statistics. 


In § 2 of this paper I give the formulas for WR and SR for the simplest
cases of spin 0, spin ½ and spin 1 fields without claiming to say
anything new†. The formulas for WR I give in a form slightly different
from that of the group of collaborators in Princeton but entirely equiva- 
lent to their formulation. I am restricting myself, however, in this paper 
to the discussion of the reflection of all coordinates simultaneously, 
while I do not consider the reflection of space or time separately. This 
has the advantage that I do not need to assume that the interaction 
energy is invariant with respect to this separate reflection. 


At first glance it seems to be a matter of pure convenience whether one 
starts with SR, AC or WR, each one of these three transformations being 
the product of the other two. But it was realized by LÜDERS in a paper
written in Copenhagen [13] that both WR and AC lead to the same
additional restrictions for the Hamiltonians‡ and he reached the important
conclusion that the SR follows from more general postulates than 
either of the other two (WR or AC). His proof, however, is not easy 
to understand, because it uses unnecessary assumptions. 
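
The statement that each of the three transformations is the product of the other two can be checked with a very small piece of bookkeeping. The sketch below is an illustration under a strong simplification: each transformation is recorded only by what it reverses (the space-time coordinates, and particle versus antiparticle), while the “inversion” of the order of factors and all details of how the fields transform are left out. The representation by pairs of flags and the name `compose` are hypothetical.

```python
# Each transformation is recorded only by what it reverses:
# (space-time coordinates, particle <-> antiparticle).
IDENTITY = (False, False)
SR = (True, True)     # strong reflection
WR = (True, False)    # weak reflection
AC = (False, True)    # particle-antiparticle conjugation

def compose(a, b):
    """Combine two transformations: a reversal survives if exactly one factor performs it."""
    return (a[0] != b[0], a[1] != b[1])

# Each of the three is the product of the other two, in either order.
for x, y, z in [(SR, AC, WR), (WR, AC, SR), (SR, WR, AC)]:
    assert compose(x, y) == z and compose(y, x) == z

print(compose(SR, SR) == IDENTITY)  # applying a reflection twice restores the original
```

At this level the three transformations are entirely on the same footing; the distinction drawn in the text between them concerns the premises under which each holds, which this bookkeeping does not capture.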


It is the main purpose of this paper to give a proof for the validity of 
the SR under more general premises by a combination of a transforma- 
tion given earlier by the author [2] and of SCHWINGER’s technique of 
“inversion” (§ 3). We do not try to derive here, from the existence of an


* The reader of SCHWINGER’s paper easily gets the wrong impression that every
transformation which reverses the sign of the space-time coordinates necessarily reverses the
sign of the electric charge too. This is obviously not the case. A careful reading of the
paper shows, however, that it does not permit one to draw certain conclusions about
SCHWINGER’s own opinion on this point.

† To my surprise Dr. G. LÜDERS informed me that equation (21a) is not contained in
the published literature. At that same time he was so kind to propose slight improvements,
which have been incorporated in the text. Meanwhile a paper appeared by H. UMEZAWA,
S. KAMEFUCHI, and S. TANAKA, Progr. Theor. Phys. 12, 383, 1954, where also this and
similar examples are treated.

‡ This was shown for the coupling between one Boson field and one Dirac field ([14],
[15], [16]) and for the Fermi coupling of four Dirac fields ([17], [18]). Compare below § 2.


SR alone, the connection between spin and statistics, but we assume 
here this connection to hold (A). Regarding special relativity we only 
assume the invariance of the Lagrangian with respect to proper Lorentz- 
transformations (continuous Lorentz group) abbreviated by L, (B). 
We assume further for the sake of simplicity the local character of the
field equation, which means that all field quantities are spinors or tensors 
of finite rank and that the interaction part of the Lagrangian (or the 
Hamiltonian) contains only derivatives of finite order of these field 
quantities (C). (Possible generalizations of these assumptions will be 
briefly indicated at the end of §3.) It is convenient to assume* that 
“kinematically independent spinor fields anticommute” (D). It would
also be possible to admit groups of spinors commuting with each other, 
but it seems mathematically artificial to forbid the occurrence of linear 
combinations between them. In order to apply assumptions (A) and 
(B) it is convenient to decompose the field quantities into parts which 
transform with respect to proper Lorentz-transformations according to 
irreducible representations (assumed to be of finite order). Each of
them is conventionally characterized by a pair of indices (m/2, n/2) with
integers m, n. If the quantity in question is written as a spinor, which is
symmetrical in each of the two groups of indices, the numbers of
undotted and dotted indices are m and n respectively. If m + n is odd
(even) we say that the field in question is a Fermion (Boson) field.


Using this definition we can now define the “‘ordering of products” 
as follows†:


“Each product of M Boson fields and N Fermion fields is to be replaced
by the sum, divided by (M + N)!, of all permutations of the
factors, each of the terms being multiplied by +1 or −1 for an even
or odd permutation of the Fermion fields, respectively.” Here assumption
(A) is implicitly used, which states that for Fermion and Boson
fields only anticommutators and commutators respectively are equal to 
c-numbers. 
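
The rule for the ordering of products can be made concrete by treating the fields as mere labels. The following is a minimal sketch under that assumption, not a field-theoretic computation: the sum runs over all permutations of the M + N factors, the sign is that of the permutation induced on the Fermion factors alone, and the result is divided by (M + N)!, the number of permutations. The names `fermion_sign` and `ordered_product` and the labels `psi1`, `psi2` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def fermion_sign(order, is_fermion):
    """+1 or -1 for an even or odd permutation of the Fermion factors alone."""
    fermions = [k for k in order if is_fermion[k]]
    inversions = sum(
        1
        for a in range(len(fermions))
        for b in range(a + 1, len(fermions))
        if fermions[a] > fermions[b]
    )
    return -1 if inversions % 2 else 1

def ordered_product(names, is_fermion):
    """Return {sequence of factor labels: coefficient} for the ordered product."""
    total = factorial(len(names))  # number of permutations of all factors
    terms = {}
    for order in permutations(range(len(names))):
        key = tuple(names[k] for k in order)
        coeff = Fraction(fermion_sign(order, is_fermion), total)
        terms[key] = terms.get(key, 0) + coeff
    return terms

# Two Fermion fields: the product is antisymmetrized.
for key, coeff in ordered_product(["psi1", "psi2"], [True, True]).items():
    print(key, coeff)
```

For two Fermion fields the rule reduces to one half of the difference of the two orderings, that is, to the antisymmetrization of bilinear expressions mentioned in § 1; for two Boson fields, or one field of each kind, both orderings enter with the same sign.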


The main difference between our premises and those of LÜDERS is
that we use neither the assumption that the theory is invariant with
respect to space or time reflection separately, nor that the Hamiltonian
is Hermitian‡. Further the independence of the argument on the gauge
group has been made more obvious. This generalization means, of 
course, that a charge integral does not necessarily exist. Nevertheless 

* Compare LÜDERS [13], postulate Ia.

† Compare LÜDERS [13], postulate IIa, which is slightly generalized here.

‡ The particular kind of connection of the two dual spaces of state vectors over the
complex conjugate is necessary, but for other physical reasons than the existence of an
SR. Compare SCHWINGER’s transposed operators [9], p. 925 f.


one can derive a generalized SR for which every ordered local vector 
changes its sign (like the space-time coordinate vector) and every 
ordered local tensor of the second rank or scalar stays unchanged. 


For the proof I return to the use of a transformation, given in [2], 
which uses a division of all spinors and tensors into four classes 
according to whether the numbers m and n, introduced above, are even 
or odd. The transformation in question can then be formulated as 
follows: For SR the fields have to be multiplied by 


$$
\begin{aligned}
(-i)(-1)^n &= i(-1)^m && \text{for } m + n \text{ odd (Fermions)} \\
(-1)^n &= (-1)^m && \text{for } m + n \text{ even (Bosons)}
\end{aligned}
\tag{T}
$$


For vectors (m = n = 1) this means reversal of the sign, which also
holds for the space-time coordinates. 


The theorem which will be proved in § 3 is this*: If (T) holds for the
original field quantities and for the space-time coordinates, it also holds
for all ordered products of them or their derivatives of finite order after
application of the inversion.

It can easily be shown that all equations which are covariant under
proper Lorentz-transformations remain valid after application of (T),
because invariant operations do not change the parity of m or n for a
product of spinors.


For the consistency of the transformation (T) it is of decisive importance
that, owing to the factor i applied in the first line of (T), all
invariant reality conditions are preserved under (T).

This is characteristic for the q-number theory, while in the c-number
theory reality restrictions for spinors are not preserved for space-time
reflections†.


It is remarkable, however, that in the q-number theory the SR appears
as a "gift," if one only assumes the continuous Lorentz group (without
gauge group) and the connection between spin and statistics without
further restrictions on the Lagrangian‡.


§2. (a) The weak reflection (WR) 


In the following we consider Lorentz-invariant quantized field theories 
including interactions between different fields, for which we shall 
discuss simple examples. We shall always assume here that half- 
integral spins are quantized with anticommutators (exclusion principle) 


* Compare also SCHWINGER [9], note 8, p. 925.

† This circumstance is closely connected with FEYNMAN's formal method of quantizing
spinor fields according to Bose statistics with the help of "negative probabilities" (see [5]).

‡ I have no definite opinion as to whether or not the generality of this result was
already known to SCHWINGER.




and integral spin with commutators (Bose statistics) without deriving
it from other postulates. The imaginary fourth coordinate is used with
$x_4 = it$, Greek indices running from 1 to 4, and the usual units
$\hbar = c = 1$. Spinor indices are not written out, but in order to
distinguish spinors from scalars, the former will be denoted by $\psi$, in
case of several spinor fields distinguished by indices, the latter by
$\phi$ or $\Phi$. The electromagnetic potential we denote by $\phi_\mu$, the
corresponding field strengths, given by the curl of it, by
$f_{\mu\nu} = -f_{\nu\mu}$.

In this paper we are only dealing with simultaneous reflections of all
space-time coordinates,

$$ x'_\mu = -x_\mu, \tag{1} $$


while the separate reflections of space or time alone are not discussed.
The reason for this will become more obvious in § 3. Consequently we
do not use here the prefix "pseudo" which remains reserved for the
space reflections. Quantities which transform in the same way for all
three kinds of reflections here considered (WR, AC and SR) are in
general denoted by the same letter, for instance $\phi_\mu$ can also mean a
vector meson field which transforms like the electromagnetic field with
respect to these three reflections. Moreover we shall introduce another
kind of scalar and vector field, denoted by $\Phi$ and $\Phi_\mu$, which
transform differently from $\phi$ and $\phi_\mu$ for the reflections WR and AC.

We start now with the discussion of the weak reflection which by
definition leaves the operator

$$ Q = \int (-i j_4)\, d^3x \tag{2} $$

of the electric charge unchanged:

$$ Q' = Q. \tag{3} $$


As we restrict ourselves here to local transformations of fields which
connect only the fields at the same space-time point, we generalize (3)
to the postulate

$$ j'_\mu(x) = j_\mu(-x). \tag{4} $$

From MAXWELL's equation and (4) it follows for the field strengths

$$ f'_{\mu\nu}(x) = -f_{\mu\nu}(-x) \tag{5} $$

and (apart from a possible additional gauge transformation) for the
potentials

$$ \phi'_\mu(x) = \phi_\mu(-x). \tag{6} $$


We shall generalize here the electromagnetic concept of charge to a
distinction between particle and anti-particle which may also include
neutrons and neutrinos. The operator Q has then the more general
meaning of the difference between the numbers of particles and
anti-particles which is an integral of the field equation owing to a
generalized gauge group. This group distinguishes two kinds of complex
(non-Hermitian) fields denoted with and without a star which are multiplied
by gauge-transformations with opposite phase factors $e^{i\alpha}$ and
$e^{-i\alpha}$ respectively. The commutators of these two kinds of fields
with Q have correspondingly an opposite sign; this is the commutation
rule (7).


The f’s are not yet specified hereby, they can be spinors, scalars or 
vectors, etc., describing particles distinguishable from their anti- 
particles. If the field f belongs to the one class, the Hermitian con- 
jugate field f* belongs to the other class. 


Independently of the gauge group there exist the energy-momentum
operators $P_\mu$ (the fourth component $P_4$ being the energy multiplied
by i) which are connected with the translation group
$x'_\mu = x_\mu + a_\mu$ for space-time and which fulfil the commutation rule

$$ i[P_\mu, f] = -\frac{\partial f}{\partial x_\mu} \tag{8} $$

for all field variables f.


For space-time reflections (1) the right-hand side of this equation
changes its sign relative to the factor f on the left-hand side, whatever
the transformation law of f may be. However, it is not possible to
change the sign of $P_\mu$, as the sign of the energy (with the vacuum as
zero) has to stay positive for physical reasons. Moreover, for scalar and
vector fields the energy is a positive definite quadratic form which can
also formally never change its sign. We therefore claim the invariance

$$ P'_\mu = P_\mu \tag{9} $$

which for our local transformations can be generalized to the invariance
of the energy-momentum density and the Lagrangian-density:

$$ T'_{\mu\nu}(x) = T_{\mu\nu}(-x), \qquad L'(x) = L(-x) \tag{9a} $$

for all three kinds of reflections here considered.


In order to obtain the desired change of the sign of the left-hand side
of (8) it is necessary to allow non-linear operations to be applied to the
state vector of the system. WIGNER chose for it the transition to the
conjugate complex (to be distinguished from the Hermitian conjugate)
of the state vector and of all operators. SCHWINGER chose the inversion
defined by the rule that all operator relations have to be read from right
to left instead of from left to right. The two procedures are equivalent
as all physical observables are represented by Hermitian operators.
For merely formal reasons we follow here the rule of SCHWINGER in
order to avoid the necessity of introducing the conjugate complex in
addition to the Hermitian conjugate (it is the latter which is here
denoted by a star) and further because we do not discuss here explicitly
the state vectors but restrict ourselves to the discussion of the field
operators (Heisenberg-representation). Hence we introduce here the
rule:

The inversion (reversal of the order of all factors) has to be an essential 
part of the transformation of the fields, if time is reversed, that means 
both for WR and for SR. 

Considering now the commutation rule (7) for Q we have to distinguish
WR, for which by definition Q retains its sign (see (3)), from SR
where the sign of Q is reversed. Taking into account that for the
inversion the commutators in (7) already produce a change of sign we
reach the conclusion:

For WR any f has always to be transformed into an f* and vice versa,
while for SR any f has to be transformed into another f (and any f* into
another f*).

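More specifically one has to put in the case of WR, which we discuss
first, for a complex scalar field (charged scalar meson)

$$ \phi'(x) = \phi^*(-x), \qquad \phi^{*\prime}(x) = \phi(-x) \tag{10} $$

and for a spinor field $\psi(x)$ and its adjoint

$$ \bar\psi(x) = \psi^*(x)\gamma_4 \tag{11} $$

$$ \psi'(x) = \Omega^{-1}\bar\psi(-x), \qquad \bar\psi'(x) = \psi(-x)\Omega. \tag{12} $$

In the latter equations the matrix $\Omega$ transforms the Hermitian Dirac
matrices $\gamma_\mu$ which satisfy

$$ \gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 2\delta_{\mu\nu} \tag{13} $$

into the transposed matrices $(\gamma_\mu)^T$ according to

$$ (\gamma_\mu)^T = \Omega\gamma_\mu\Omega^{-1}. \tag{14} $$

The unitary matrix $\Omega$ satisfies*

$$ \Omega\Omega^* = 1, \qquad \Omega^T = -\Omega. \tag{15} $$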

One should bear in mind that in the transformation here indicated an 
additional gauge transformation with a constant phase (which includes, 


* See [1], where $\Omega$ was denoted by B.




for instance, an additional change of sign in (10) ) is still free. This 
becomes important if interactions are considered by which the particles 
in question can be emitted or absorbed and if several independent fields 
are considered. One checks easily that with this transformation and the 
following inversion the Klein—Gordon equation for scalars and the 
Dirac equations for spinors are preserved. Using (5) one checks 
further that this also holds in an external electromagnetic field. The
invariance of the current vectors (see (4))

$$ j_\mu(x) = ie\left(\frac{\partial\phi^*}{\partial x_\mu}\,\phi - \phi^*\,\frac{\partial\phi}{\partial x_\mu}\right) \tag{16} $$

and

$$ j_\mu(x) = ie\,\bar\psi(x)\gamma_\mu\psi(x) \tag{17} $$

respectively is fulfilled after inversion is performed. The latter
transforms for instance $\psi(-x)(\gamma_\mu)^T\bar\psi(-x)$ into
$\bar\psi(-x)\gamma_\mu\psi(-x)$.

Moreover the invariance of energy-momentum density and Lagrangian
density (see (9) and (9a)) holds for these spin 0 and spin ½ particles
in an external electromagnetic field.

By decomposing the field into plane waves it can be shown that the 
transformation (10) or (12) can be interpreted as replacing every 
emission operator by an absorption operator of the same eigen- 
vibration without interchange of particle and anti-particle. 

The form of the WR here given makes it obvious that the commutation
rules for the fields do not enter here. The general reason is that the
substitution of an f* for an f in a product, let us say, of the form f*g
brings it first to the form gf* while the following inversion re-establishes
the original order f*g, so that finally the order of factors is preserved.
Therefore this WR transformation holds, whether or not the normal
connection of spin ½ with exclusion principle and of spin 0 with
Bose-statistics is assumed (see [12]).

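For the discussion of further examples it is useful to give here besides
the current also the transformation of the five other covariant bilinear
forms of the spinor $\psi$ and its adjoint $\bar\psi$ with the help of the
matrices

$$ \gamma_5 = \gamma_1\gamma_2\gamma_3\gamma_4 \tag{18} $$

and

$$ \gamma_{[\mu\nu]} = \tfrac{i}{2}(\gamma_\mu\gamma_\nu - \gamma_\nu\gamma_\mu). \tag{19} $$

These are determined by the signs in the equations (see (14))

$$
\begin{aligned}
(\gamma_5)^T &= \Omega\gamma_5\Omega^{-1}, & (\gamma_{[\mu\nu]})^T &= -\Omega\gamma_{[\mu\nu]}\Omega^{-1}, \\
[\gamma_5\gamma_\mu]^T &= -\Omega\gamma_5\gamma_\mu\Omega^{-1}, & [\gamma_5\gamma_{[\mu\nu]}]^T &= -\Omega\gamma_5\gamma_{[\mu\nu]}\Omega^{-1}.
\end{aligned}
\tag{14a}
$$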




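In this way one obtains (besides the change of the argument $x$ into $-x$)
for WR:

$$
\begin{aligned}
&\text{the } + \text{ sign for:} && \bar\psi\psi, \quad i\bar\psi\gamma_\mu\psi, \quad i\bar\psi\gamma_5\psi, \\
&\text{the } - \text{ sign for:} && i\bar\psi\gamma_5\gamma_\mu\psi, \quad \bar\psi\gamma_{[\mu\nu]}\psi, \quad \bar\psi\gamma_5\gamma_{[\mu\nu]}\psi.
\end{aligned}
\tag{20}
$$

The factors i are added here in such a way that owing to (11) the given
expressions become Hermitian.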
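The signs in (14), (14a) and (15) can be checked numerically in an explicit representation. The sketch below is only an illustration under stated assumptions: the particular Hermitian representation of the Dirac matrices and the choice of the matrix of (14) as the product of the first and third Dirac matrices (which is fixed only up to a phase) are not taken from the text, and the helper names are hypothetical.

```python
# Numerical check of the transposition signs in (14), (14a) and (15).
# Assumption (not from the text): the explicit representation below and the
# choice OMEGA = gamma_1 gamma_3, which is fixed only up to a phase.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def transpose(a):
    return [list(column) for column in zip(*a)]


def scale(factor, a):
    return [[factor * entry for entry in row] for row in a]


def from_blocks(top_left, top_right, bottom_left, bottom_right):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [top_left[0] + top_right[0], top_left[1] + top_right[1],
            bottom_left[0] + bottom_right[0], bottom_left[1] + bottom_right[1]]


ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Hermitian gamma_1..gamma_4 satisfying (13); list index 0..3 stands for mu = 1..4.
GAMMA = [from_blocks(ZERO2, scale(-1j, s), scale(1j, s), ZERO2) for s in PAULI]
GAMMA.append(from_blocks(ONE2, ZERO2, ZERO2, scale(-1, ONE2)))
GAMMA5 = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))  # (18)
IDENTITY = matmul(GAMMA[3], GAMMA[3])

OMEGA = matmul(GAMMA[0], GAMMA[2])
OMEGA_INV = matmul(GAMMA[2], GAMMA[0])


def tensor(mu, nu):
    """gamma_[mu nu] for mu != nu; only its sign under transposition matters here."""
    return scale(1j, matmul(GAMMA[mu], GAMMA[nu]))


def wr_sign(matrix):
    """Return s with matrix^T = s * OMEGA matrix OMEGA^-1, where s is +1 or -1."""
    lhs = transpose(matrix)
    rhs = matmul(matmul(OMEGA, matrix), OMEGA_INV)
    if lhs == rhs:
        return 1
    if lhs == scale(-1, rhs):
        return -1
    raise ValueError("matrix has no definite sign under transposition")
```

Calling `wr_sign` on the unit matrix, on each Dirac matrix and on the products entering (14a) reproduces the two groups of signs listed in (20).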


We discuss now as an example the interaction of one Boson field 
with a Dirac-spinor field. First we assume the former to be a real 
(Hermitian) field, corresponding to neutral particles, either a scalar or a 
vector. The invariance with respect to proper Lorentz transformations 
gives the following possibilities for the Lagrangian density of the 
interaction 


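$$
\begin{aligned}
&\Phi\,\bar\psi\psi; && \Phi\,(i\bar\psi\gamma_5\psi) \\
&\frac{\partial\phi}{\partial x_\mu}\,(i\bar\psi\gamma_\mu\psi) + \text{H.c.}; && \phi_\mu\,(i\bar\psi\gamma_\mu\psi) \\
&\frac{\partial\Phi}{\partial x_\mu}\,(i\bar\psi\gamma_5\gamma_\mu\psi) + \text{H.c.}; && \Phi_\mu\,(i\bar\psi\gamma_5\gamma_\mu\psi) \\
&\left(\frac{\partial\phi_\mu}{\partial x_\nu} - \frac{\partial\phi_\nu}{\partial x_\mu}\right)(\bar\psi\gamma_{[\mu\nu]}\psi); && \left(\frac{\partial\phi_\mu}{\partial x_\nu} - \frac{\partial\phi_\nu}{\partial x_\mu}\right)(\bar\psi\gamma_5\gamma_{[\mu\nu]}\psi).
\end{aligned}
\tag{21}
$$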


Here “+ H.c.” means that the Hermitian conjugate expression, which 
is necessary because of the derivatives, must be added. 


For WR the Boson field has to be transformed in such a way that the
Lagrangian density (21) stays invariant (see (9a)). It turns out then
that one must transform the fields written with a capital letter differently
from the fields with a small letter, namely

$$ \phi'(x) = -\phi(-x), \qquad \phi'_\mu(x) = \phi_\mu(-x), \tag{22a} $$

$$ \Phi'(x) = +\Phi(-x), \qquad \Phi'_\mu(x) = -\Phi_\mu(-x). \tag{22b} $$

The notation is chosen in such a way that $\phi_\mu$ transforms like the
electromagnetic potential (see (6)) and like $\partial\phi/\partial x_\mu$,
and that $\Phi_\mu$ transforms like $\partial\Phi/\partial x_\mu$.




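Terms belonging to different kinds of fields cannot be mixed, for
instance the combinations

$$ C_1\,\phi\,(\bar\psi\psi) + C_2\,\frac{\partial\phi}{\partial x_\mu}\,(i\bar\psi\gamma_\mu\psi) $$

or

$$ C_1\,\phi_\mu\,(i\bar\psi\gamma_5\gamma_\mu\psi) + C_2\left(\frac{\partial\phi_\mu}{\partial x_\nu} - \frac{\partial\phi_\nu}{\partial x_\mu}\right)(\bar\psi\gamma_5\gamma_{[\mu\nu]}\psi) $$

are both forbidden, as is already known* (see [14], [15], [16]).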


We consider now the analogous but somewhat more general case of a
charged Boson described by a complex scalar field $\phi$ in interaction with
one charged spin ½ particle, for instance a proton described by a
Dirac-spinor $\psi_P$, and another neutral spin ½ particle, for instance a
neutron, whose Dirac spinor field we denote by $\psi_N$. The result is
somewhat different as here all combinations of interaction energies can
occur but with additional reality restrictions for the coefficients.


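We write here the Lagrangian density of the interaction in the form

$$
\begin{aligned}
& C_1\,\phi\,(\bar\psi_P\psi_N) + C_1^*\,\phi^*(\bar\psi_N\psi_P) \\
&+ C_2\,\phi\,(i\bar\psi_P\gamma_5\psi_N) + C_2^*\,\phi^*(i\bar\psi_N\gamma_5\psi_P) \\
&+ iC_3\,\frac{\partial\phi}{\partial x_\mu}\,(i\bar\psi_P\gamma_\mu\psi_N) - iC_3^*\,\frac{\partial\phi^*}{\partial x_\mu}\,(i\bar\psi_N\gamma_\mu\psi_P) + \text{H.c.} \\
&+ C_4\,\frac{\partial\phi}{\partial x_\mu}\,(i\bar\psi_P\gamma_5\gamma_\mu\psi_N) + C_4^*\,\frac{\partial\phi^*}{\partial x_\mu}\,(i\bar\psi_N\gamma_5\gamma_\mu\psi_P) + \text{H.c.} \\
&+ C_5\,\phi_\mu\,(i\bar\psi_P\gamma_\mu\psi_N) + C_5^*\,\phi_\mu^*\,(i\bar\psi_N\gamma_\mu\psi_P) \\
&+ iC_6\,\phi_\mu\,(i\bar\psi_P\gamma_5\gamma_\mu\psi_N) - iC_6^*\,\phi_\mu^*\,(i\bar\psi_N\gamma_5\gamma_\mu\psi_P) \\
&+ \left(\frac{\partial\phi_\mu}{\partial x_\nu} - \frac{\partial\phi_\nu}{\partial x_\mu}\right)\{C_7(\bar\psi_P\gamma_{[\mu\nu]}\psi_N) + C_8(\bar\psi_P\gamma_5\gamma_{[\mu\nu]}\psi_N)\} + \text{H.c.} \\
&+ \left(\frac{\partial\phi_\mu^*}{\partial x_\nu} - \frac{\partial\phi_\nu^*}{\partial x_\mu}\right)\{C_7^*(\bar\psi_N\gamma_{[\mu\nu]}\psi_P) + C_8^*(\bar\psi_N\gamma_5\gamma_{[\mu\nu]}\psi_P)\} + \text{H.c.}
\end{aligned}
\tag{21a}
$$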


* We do not discuss here the additional restrictions due to the space-reflection alone.
They are not sufficient to exclude the forbidden combinations indicated in the text.


Before discussing the reflection, we notice that the multiplication
with different phase factors of the two spinor fields and of the two
Boson fields is equivalent to the multiplication of $C_1, \ldots, C_4$ and of
$C_5, \ldots, C_8$, respectively, by common phase factors, which are therefore
only conventional. Apart from this the essential result of the
transformation (12) of the two spinor fields and the subsequent inversion is
here to substitute for every expression $\bar\psi_A O\psi_B$ its conjugate
$\bar\psi_B O\psi_A$ with a sign given by (20). For the complex Boson field
it is here sufficient to apply the transformation (10) amended by a similar
transformation for the vector field, namely

$$
\begin{aligned}
\phi'(x) &= \phi^*(-x), & \phi^{*\prime}(x) &= \phi(-x); \\
\phi'_\mu(x) &= \phi_\mu^*(-x), & \phi_\mu^{*\prime}(x) &= \phi_\mu(-x).
\end{aligned}
\tag{10a}
$$

We have taken care of the signs (20) by properly inserting the factor i
in the terms of (21a) with the coefficients $C_3$ and $C_6$. We obtain then
the simple condition: the coefficients $C_1, \ldots, C_8$ must be real, apart
from two trivial phase factors which can be normalized to unity*.

Very similar to this is the example of the Fermi coupling of four
Dirac fields describing protons $\psi_P$, neutrons $\psi_N$, electrons
$\psi_e$ and neutrinos $\psi_\nu$. The general possibility for the
interaction energy (or Lagrangian) density without derivatives is here


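$$
\begin{aligned}
& C_1(\bar\psi_P\psi_N)(\bar\psi_e\psi_\nu) + C_1^*(\bar\psi_N\psi_P)(\bar\psi_\nu\psi_e) \\
&+ C_2(\bar\psi_P\gamma_\mu\psi_N)(\bar\psi_e\gamma_\mu\psi_\nu) + C_2^*(\bar\psi_N\gamma_\mu\psi_P)(\bar\psi_\nu\gamma_\mu\psi_e) \\
&+ C_3(\bar\psi_P\gamma_{[\mu\nu]}\psi_N)(\bar\psi_e\gamma_{[\mu\nu]}\psi_\nu) + C_3^*(\bar\psi_N\gamma_{[\mu\nu]}\psi_P)(\bar\psi_\nu\gamma_{[\mu\nu]}\psi_e) \\
&+ C_4(\bar\psi_P\gamma_5\psi_N)(\bar\psi_e\gamma_5\psi_\nu) + C_4^*(\bar\psi_N\gamma_5\psi_P)(\bar\psi_\nu\gamma_5\psi_e) \\
&+ C_5(\bar\psi_P\gamma_5\gamma_\mu\psi_N)(\bar\psi_e\gamma_5\gamma_\mu\psi_\nu) + C_5^*(\bar\psi_N\gamma_5\gamma_\mu\psi_P)(\bar\psi_\nu\gamma_5\gamma_\mu\psi_e) \\
&+ C_6\,i(\bar\psi_P\gamma_5\psi_N)(\bar\psi_e\psi_\nu) + C_6^*\,i(\bar\psi_N\gamma_5\psi_P)(\bar\psi_\nu\psi_e) \\
&+ C_7\,i(\bar\psi_P\psi_N)(\bar\psi_e\gamma_5\psi_\nu) + C_7^*\,i(\bar\psi_N\psi_P)(\bar\psi_\nu\gamma_5\psi_e) \\
&+ i\{C_8(\bar\psi_P\gamma_5\gamma_\mu\psi_N)(\bar\psi_e\gamma_\mu\psi_\nu) - C_8^*(\bar\psi_N\gamma_5\gamma_\mu\psi_P)(\bar\psi_\nu\gamma_\mu\psi_e)\} \\
&+ i\{C_9(\bar\psi_P\gamma_\mu\psi_N)(\bar\psi_e\gamma_5\gamma_\mu\psi_\nu) - C_9^*(\bar\psi_N\gamma_\mu\psi_P)(\bar\psi_\nu\gamma_5\gamma_\mu\psi_e)\} \\
&+ C_{10}(\bar\psi_P\gamma_5\gamma_{[\mu\nu]}\psi_N)(\bar\psi_e\gamma_{[\mu\nu]}\psi_\nu) + C_{10}^*(\bar\psi_N\gamma_5\gamma_{[\mu\nu]}\psi_P)(\bar\psi_\nu\gamma_{[\mu\nu]}\psi_e).
\end{aligned}
\tag{23}
$$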


Here too the multiplication of the four spinor fields with different
constant phase factors is equivalent to the multiplication of
$C_1, \ldots, C_{10}$ with a common phase-factor which is therefore only
conventional. Such phase-factors are also arbitrary in the matrices
$\Omega$ defined by (14), (15) for the four spinor-fields, which again will
give rise to a possible common phase-factor of $C_1, \ldots, C_{10}$ after
the WR transformation.


* Here again we do not discuss additional restrictions due to space reflection alone.
The latter makes it necessary to separate the "pseudo" terms which contain explicitly the
$\gamma_5$ from the others.




Apart from this the essential result of the transformation (12) and the
subsequent inversion applied to the four spinor fields is here again to
substitute for every expression $\bar\psi_A O\psi_B$ its conjugate
$\bar\psi_B O\psi_A$. The signs given by (20) cancel except for the
coefficients $C_8$ and $C_9$ for which again a factor i has been added.
Then we get the condition that the coefficients $C_1, \ldots, C_{10}$ must
be real (apart from a trivial common phase-factor which can be normalized
to unity). This result is also well known (see [17], [18])*. Summarizing
we see:


(1) The transformation law of a quantity with respect to proper 
Lorentz-transformations does not determine uniquely its behaviour for 
WR. The latter also depends on the assumed interaction energy. 


(2) The invariance with respect to WR imposes further restrictions 
upon the Lagrangian density of the interaction besides its invariance for 
proper Lorentz transformations. 


The discussion of the SR will show us now that just the contrary is 
true in this other case: No further restrictions are necessary besides the 
proper Lorentz invariance to guarantee invariance for SR and the trans- 
formation of a quantity for SR follows uniquely from its spinor or tensor 
character. A general proof for this will be given in § 3. 


As SR is the product of WR and AC this statement is equivalent to 
the other that the same additional restrictions are imposed upon the 
interaction Lagrangian densities by AC as by WR and that the trans- 
formation of a certain kind of tensor or spinor for WR uniquely 
determines its transformation for AC. (In the literature the latter 
confirmation has been made before the former discussion in various 
cases.) We could, of course, derive the SR as the product of WR and 
AC, the latter being well known, but we prefer to treat the former here 
directly. 


§2. (b) The strong reflection (SR) 


The invariance (9), (9a) for energy-momentum density and Lagrangian
density is here equally valid as in the WR and here, too, the inversion
defined above is an essential part of the transformation. However, by
definition, the "charge" operator Q and the corresponding current
$j_\mu(x)$ have to change their sign:

$$ Q' = -Q, \qquad j'_\mu(x) = -j_\mu(-x). \tag{24} $$

* The invariance for space reflections alone not considered here would exclude the
coexistence of the interactions with the coefficients $C_1, \ldots, C_5$ and $C_6, \ldots, C_{10}$.

A. S. WIGHTMAN and L. MICHEL kindly inform me that they are preparing a paper
where the questions of invariance for space reflections for the interactions describing the
β-decay will be treated in detail.




From this follow just the opposite signs as in (5), (6) for the
electromagnetic field quantities:

$$ \phi'_\mu(x) = -\phi_\mu(-x), \qquad f'_{\mu\nu}(x) = +f_{\mu\nu}(-x). \tag{25} $$

From (7) it follows that no transition to the complex conjugate operators
will take place here.

For scalars and spinors we have according to SCHWINGER the simple
law (again apart from a possible gauge transformation with a constant
phase factor)

$$ \phi'(x) = \phi(-x), \qquad \phi^{*\prime}(x) = \phi^*(-x) \qquad \text{for scalars} \tag{26} $$

$$ \psi'(x) = \gamma_5\psi(-x), \qquad \bar\psi'(x) = -\bar\psi(-x)\gamma_5 \qquad \text{for spinors} \tag{27} $$


The minus sign in the last formula has its origin in the anti-commutativity
of $\gamma_5$ with the $\gamma_4$ occurring in

$$ \bar\psi(x) = \psi^*(x)\gamma_4 \tag{11} $$

so that the latter equation also holds after the transformation. The
arbitrary phase-factor, still open besides (27), implies the particular
possibility of replacing (27) by

$$ \psi'(x) = i\gamma_5\psi(-x), \qquad \bar\psi'(x) = -(-i)\bar\psi(-x)\gamma_5 = i\bar\psi(-x)\gamma_5. \tag{27a} $$

While the operation (27) performed twice gives the identity, the other
possibility (27a) performed twice gives a change of the sign of all
spinors. I mention here only briefly that for the Majorana theory
(where particles and anti-particles are made identical) only the second
possibility (27a) is admissible. The matrix $\gamma_5$ is necessary to
preserve the validity of the Dirac equation.

The physical sense of the transformation (27) is the substitution of 
every emission (absorption) operator by the corresponding absorption 
(emission) operator of the anti-particle. 

Without quantization the transformation (27) would give invariance 
of the current rather than a change of its sign. For the effect of the 
inversion, however, the commutation laws are here essential. This is 
different from WR but the same holds for AC, too. 

In order to guarantee the change of the sign of the current, indicated
in (24), one has to assume for spinors the quantization by
anticommutators, for scalars by commutators. This statement has to be
understood in such a way that the particular transformation (26) or (27),
followed by an inversion, is its premise. Moreover one has then to write the
current (17) for spinors as an antisymmetrized product

$$ j_\mu(x) = \frac{ie}{2}\{\bar\psi(x)\gamma_\mu\psi(x) - \psi(x)(\gamma_\mu)^T\bar\psi(x)\} \tag{17a} $$

so that the inversion just changes the sign. For the current of scalars one
has to proceed in an analogous way with symmetrization*. If this
additional rule is applied the transformations (26), (27) followed by
inversion fulfil all requirements for SR.


Passing to the more general bilinear forms discussed earlier (see
eq. (20)) the resulting sign is now different: it is + or − according as
$\gamma_5$ commutes or anti-commutes with the matrix between $\bar\psi$ and
$\psi$. This gives the following table of signs (for the last column
compare [16]).


                                                                        WR    SR    AC
$\bar\psi\psi$, $i\bar\psi\gamma_5\psi$                                  +     +     +
$i\bar\psi\gamma_\mu\psi$                                                +     −     −
$i\bar\psi\gamma_5\gamma_\mu\psi$                                        −     −     +
$\bar\psi\gamma_{[\mu\nu]}\psi$, $\bar\psi\gamma_5\gamma_{[\mu\nu]}\psi$  −     +     −


In every horizontal line each of the three signs is the product of the two
others. For SR the matrix $\gamma_5$ has no influence on the sign†, the
vectors and skewsymmetric tensors transform just like the analogous
electromagnetic quantities (see (24) and (25) respectively) and the scalars
are invariant. The first column of the table contains abbreviations
inasmuch as each $\bar\psi O\psi$ should be properly replaced by its
antisymmetrization

$$ \tfrac{1}{2}(\bar\psi O\psi - \psi O^T\bar\psi). $$
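The table of signs can be cross-checked by a simple counting argument, offered here only as a sketch and not as the derivation used in the text. It assumes that the matrix between the two spinors is, up to a numerical factor, a product of k distinct Dirac matrices: by (14) transposition reverses the order of the factors, which costs k(k − 1)/2 interchanges, and the fifth matrix of (18) anticommutes with each single factor. The function name and the row labels are hypothetical.

```python
def reflection_signs(grade):
    """Signs (WR, SR, AC) for a bilinear whose matrix is, up to a factor,
    a product of `grade` distinct Dirac matrices (grade 0: unit matrix)."""
    wr = (-1) ** (grade * (grade - 1) // 2)  # reversing the factor order, cf. (14)
    sr = (-1) ** grade                       # moving gamma_5 through the factors
    ac = wr * sr                             # each sign is the product of the other two
    return wr, sr, ac


# Grade assigned to each kind of matrix (a labelling assumed for this sketch).
# gamma_5 times gamma_[mu nu] is proportional to the product of the two
# remaining Dirac matrices, so it has grade 2 as well.
ROWS = {
    "1": 0,
    "gamma_mu": 1,
    "gamma_[mu nu]": 2,
    "gamma_5 gamma_[mu nu]": 2,
    "gamma_5 gamma_mu": 3,
    "gamma_5": 4,
}


def sign_table():
    """Map each row label to its (WR, SR, AC) signs."""
    return {name: reflection_signs(grade) for name, grade in ROWS.items()}
```

The resulting signs agree row by row with the table, including the statement that the fifth matrix has no influence on the sign for SR.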
Investigating the interaction densities (21), (21a) and (23) we see that
after antisymmetrization of the products they are automatically
invariant for SR, if all scalars and skewsymmetric tensors are invariants
and all vectors change their sign. This actually holds generally if it is
assumed for all original Boson fields, as in an expression of the form

$$ \tfrac{1}{2}(\bar\psi_A O\psi_B - \psi_B O^T\bar\psi_A) $$

the indices A and B are not exchanged in this case. Therefore, the
invariance for SR holds in these examples for arbitrary phases of all
complex coefficients.

* In order to avoid ambiguities due to infinities the commutator [A(x), B(x)] of two
field quantities should be defined as

$$ \lim_{\epsilon \to 0} \tfrac{1}{2}\{[A(x - \epsilon), B(x + \epsilon)] + [A(x + \epsilon), B(x - \epsilon)]\}. $$

† The concept "pseudo" does not exist here, but only for separate spatial or temporal
reflections.




§2. (c) Particle-Anti-particle conjugation (AC)


Combining these results for SR with the earlier results for WR, one
gets the following simple rules for AC (which one could also have
derived directly):

For spinors

$$
\begin{gathered}
\psi'(x) = C^{-1}\bar\psi(x), \qquad \bar\psi'(x) = -\psi(x)C, \\
C = \Omega\gamma_5, \qquad C^*C = 1, \qquad C^T = -C, \qquad C\gamma_\mu = -(\gamma_\mu)^T C;
\end{gathered}
\tag{AC}
$$

for real Boson fields with small letters in (21) (including electromagnetic
potentials) change of sign, for capital letters invariance:

$$
\begin{aligned}
\phi'(x) &= -\phi(x), & \Phi'(x) &= +\Phi(x); \\
\phi'_\mu(x) &= -\phi_\mu(x), & \Phi'_\mu(x) &= +\Phi_\mu(x);
\end{aligned}
\tag{AC}
$$

for the complex Boson fields in (21a)

$$
\begin{aligned}
\phi'(x) &= \phi^*(+x), & \phi^{*\prime}(x) &= \phi(+x); \\
\phi'_\mu(x) &= -\phi_\mu^*(+x), & \phi_\mu^{*\prime}(x) &= -\phi_\mu(+x).
\end{aligned}
\tag{AC}
$$

The same reality restrictions for the coefficients are essential for AC as
for WR. For AC the commutation laws of the fields are just as essential
as for SR.


Concluding the discussion of particular examples*, we pass now to
the proof of our general statement on SR.

* For the isotopic spin formalism regarding AC we refer to a very general discussion
by MICHEL [20], regarding WR to EISENBUD and WIGNER [19].
For SR the isotopic spin does not play any role in the transformations.


§ 3. General proof of an SR as a consequence of the continuous Lorentz
group and of the spin-statistics connection


We assume here that the theory is local, which means that only
derivatives with respect to the coordinates of a finite order occur and
that all field quantities transform with respect to the continuous
Lorentz group, in which no reflections are included, according to
representations of finite degree (spinors or tensors). The irreducible
representations of this group are usually characterized by two
numbers, for which we choose the integers m, n equal to twice the
conventional quantum numbers, so that the degree of this representation
is (m + 1)(n + 1). The decomposition of a direct product of two
irreducible representations into irreducible parts gives always numbers
m of the same parity and numbers n of the same parity, in other words
the different numbers m or n so obtained differ by an even integer. This
can easily be seen, for instance, in the spinor calculus which introduces
two groups of indices, the dotted and the undotted ones with only two
values for each index. The irreducible spinors are symmetric in every
group of indices and m and n can then be identified with the numbers of
indices in these two groups respectively. The only invariant operations
are the direct product of two quantities and the contraction with the
skewsymmetric tensor $\epsilon_{12} = -\epsilon_{21} = 1$ which reduces the
number of indices of one group by two. This suggests a division of all
spinors into four classes defined by the parity of the integers m and n [2].
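The parity statement can be illustrated with the familiar rule for the reduction of a direct product, applied separately to each of the two groups of indices. This is a minimal sketch under that assumption; the function names are hypothetical and each representation is written as a pair of integers (m, n) as above.

```python
def decompose(rep_a, rep_b):
    """Irreducible parts of the direct product of two representations (m, n).

    Each index group combines like an angular momentum of twice-integer
    label, so the labels run in steps of 2 between the difference and the sum.
    """
    (m1, n1), (m2, n2) = rep_a, rep_b
    return [
        (m, n)
        for m in range(abs(m1 - m2), m1 + m2 + 1, 2)
        for n in range(abs(n1 - n2), n1 + n2 + 1, 2)
    ]


def degree(rep):
    """Degree (m + 1)(n + 1) of the irreducible representation (m, n)."""
    m, n = rep
    return (m + 1) * (n + 1)


def parity_class(rep):
    """One of the four classes, given by the parities of m and n."""
    m, n = rep
    return (m % 2, n % 2)
```

For example, the product of two vectors (1, 1) contains the parts (0, 0), (0, 2), (2, 0) and (2, 2), all of the same class, and their degrees add up to 16.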

A Dirac spinor with four components can be decomposed into two
irreducible parts, with two components each, characterized by (1,0)
and (0,1) corresponding to the diagonalization of the matrix $\gamma_5$
with its two eigenvalues $+1$ and $-1$. The transformation

$$ \psi' = \gamma_5\psi \tag{27} $$

which we applied to Dirac spinors yields then for the irreducible
quantities u(1,0) and v(0,1) the simple form

$$ u'(1,0) = u(1,0), \qquad v'(0,1) = -v(0,1). \tag{27 bis} $$

The simplest generalization of it for a quantity of the type (n,m),
denoted by u(n,m), would be the rule

$$ u'(n,m) = (-1)^m u(n,m). \tag{28} $$

Indeed, $(-1)^m$ only depends on the parity of m; that means it is a
character of the class and in the multiplication of any two quantities
u(n′,m′) and u(n″,m″) the characters are multiplied too. Hence every
equation which is invariant with respect to the continuous Lorentz group
remains valid if every covariant quantity is multiplied by $(-1)^m$. As a
vector corresponds to n = m = 1, the transformation (28) changes the sign
of every vector* and it seems to be suitable as a general rule for SR.

* The self-dual tensor with (n,m) = (2,0) or (0,2), and the symmetric tensor with trace
zero (2,2) remain invariant like the scalar (0,0).

There is, however, an important qualification to the last italicized
statement, namely: provided that no reality conditions are used. As the
dotted and the undotted spinor indices transform according to complex
conjugate representations, this also holds for two quantities u(n,m) and
v(m,n). Invariant reality conditions have therefore the form

$$ (u(n,m))^* = v(m,n). \tag{29} $$

It is obvious that such a condition is not generally preserved under
the transformation (28), as hereby the left-hand side is multiplied by
$(-1)^m$, the right-hand side by $(-1)^n$ which only agrees if m and n have
the same parity (m + n even).

This can also be verified by using Dirac spinors, because a general
application of (28) also implies*

$$ \bar\psi'(x) = +\bar\psi(-x)\gamma_5, \qquad \psi'(x) = \gamma_5\psi(-x) \tag{27F} $$

while the relation

$$ \bar\psi = \psi^*\gamma_4 \tag{11} $$

which is a particular case of (29) leads to

$$ \bar\psi'(x) = -\bar\psi(-x)\gamma_5 \tag{27} $$

as was already indicated.


The transformation (27F) is indeed the correct rule for SR in Feynman's
theory which quantizes the Dirac spinors according to Bose
statistics by abandoning the reality conditions. In this theory $\psi^*$ is
not any longer the Hermitian conjugate of $\psi$ but the "self-adjoint" with
respect to an indefinite metric in the Hilbert space ("negative
probabilities") [5]. Then it is no longer a contradiction to assume
$\psi' = \gamma_5\psi$ and $\psi'^* = -\psi^*\gamma_5$.

This mathematical possibility, however, has no connection with
physics for which the reality conditions are essential. Therefore I
proposed in 1940 [2] the transformation

$$ u'(1,0) = iu(1,0), \qquad v'(0,1) = -iv(0,1) \tag{30} $$

* To show this we apply the spinor calculus according to which

$$ u^1 = u_2, \qquad u^2 = -u_1 \tag{R} $$

transform contragrediently (inversely) to $(u_1, u_2)$. Use is here made of the fact that the
determinant of the transformation has the value 1. Moreover $v_1, v_2$ transform as complex
conjugate to $u_1, u_2$. For the representation

$$ \gamma_5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \gamma_4 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} $$

with two row matrices in each place, the four components of $\psi$ and $\bar\psi$ can then be written

$$ \psi \sim (u_1, u_2;\ v^1, v^2), \qquad \bar\psi \sim (\bar u^1, \bar u^2;\ \bar v_1, \bar v_2). $$

The relation (11), namely $\bar\psi = \psi^*\gamma_4$, then gives

$$ \bar u^1 = (v^1)^*, \quad \bar u^2 = (v^2)^*; \qquad \bar v_1 = u_1^*, \quad \bar v_2 = u_2^* $$

which is of the form (29). The real invariants $\bar\psi\psi$ and $(-i)(\bar\psi\gamma_5\psi)$ are then twice the real
and imaginary part of

$$ u_1(v^1)^* + u_2(v^2)^* = u_1 v_2^* - u_2 v_1^*. $$

The second form follows from the rule (R) for the raising of the spinor indices. As
$v_1^*, v_2^*$ transform like $u_1, u_2$ the invariance of the final expression is obvious.


which preserves invariant reality conditions of the form

$$ v(0,1) = (u(1,0))^*. $$

This is equivalent to the alternative (27a) given before, namely

$$ \psi'(x) = i\gamma_5\psi(-x), \qquad \bar\psi'(x) = i\bar\psi(-x)\gamma_5. \tag{27a} $$

This is now consistent from the standpoint of the continuous Lorentz group
alone (without use of the gauge-group) as, for $\gamma_5$ diagonal, $\psi$
and $\bar\psi$ transform here in the same way*.


The proper generalization of (30) is obviously

$$
\begin{aligned}
u'(n,m) &= (-i)(-1)^n u(n,m) = i(-1)^m u(n,m) && \text{for } n + m \text{ odd} \\
u'(n,m) &= (-1)^n u(n,m) = (-1)^m u(n,m) && \text{for } n + m \text{ even}
\end{aligned}
\tag{31}
$$

which is in agreement with the general reality condition (29). The
first line follows from (30) for all transformations which only depend
on the class of the quantities, and the second line is necessary for SR,
as vectors must change their sign, while scalars must remain invariant.
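The difference between the rules (28) and (31) with respect to the reality condition (29) can be tabulated directly. The following sketch is only an illustration; the function names are hypothetical, and the check encodes (29) as the requirement that the complex conjugate of the factor for u(n,m) equal the factor for v(m,n).

```python
def rule_28(n, m):
    """Factor multiplying u(n, m) under the tentative rule (28)."""
    return (-1) ** m


def rule_31(n, m):
    """Factor multiplying u(n, m) under the rule (31)."""
    if (n + m) % 2:                # n + m odd: Fermion class
        return 1j * (-1) ** m
    return (-1) ** m               # n + m even: Boson class


def preserves_reality(rule, n, m):
    """True if (u(n,m))* = v(m,n) survives the transformation given by `rule`."""
    return complex(rule(n, m)).conjugate() == complex(rule(m, n))
```

With these definitions `rule_31(1, 0)` and `rule_31(0, 1)` reproduce the factors of (30), a vector (1, 1) changes its sign, and the reality condition is preserved for every class, whereas `rule_28` fails for n + m odd.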

No difficulty arises for a product of an arbitrary number of factors 
with n + m even, which we may call Boson fields in contrast to Fermion 
fields with m + n odd. Nor is there anything new for a product of an 
arbitrary number of Boson fields with one Fermion field. This includes 
the possibility of applying also derivatives of the fields with respect to 
the coordinates, of a finite order, as they are formally of the class n + m 
even. 

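However, the transformation law (31) is not generally preserved for a
product of two or more factors of the class n + m odd (Fermions).
Consider a product of N Fermion factors corresponding to symbols
$(n_k, m_k)$ with $k = 1, \ldots, N$. Putting

$$ n = \sum_{k=1}^{N} n_k, \qquad m = \sum_{k=1}^{N} m_k $$

we have to compare

$$ \prod_{k=1}^{N} (-i)(-1)^{n_k} = (-i)^N(-1)^n \quad \text{with} \quad \begin{cases} (-i)(-1)^n & \text{for } N \text{ odd,} \\ (-1)^n & \text{for } N \text{ even.} \end{cases} $$

This gives an extra factor

$$ (-i)^{N-1} \text{ for } N \text{ odd,} \qquad (-i)^N \text{ for } N \text{ even.} $$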

* With respect to the continuous Lorentz group, $\bar\psi$ transforms in the same way as
$\psi\Omega\gamma_5 = \psi C$, with $\Omega$ defined by (14). Compare ref. [1].


With a new integer $\nu$ this can also be written, both for
$N = 2\nu + 1$ (odd) and for $N = 2\nu$ (even),

$$ (-i)^{2\nu} = (-1)^\nu. \tag{32} $$

To dispose of this extra factor we have now to apply the second
quantization with anticommutators for Fermions*, commutators for
Bosons. This enables us first to assume that all products of field quantities
have to be antisymmetrized for all permutations of the Fermion fields and
symmetrized for all possible positions of the Boson fields† (ordering of
products).

Then we must apply for SR the inversion of all products (read them 
from right to left instead of from left to right) in addition to the trans- 
formation (31) for the original field quantities. 


Now we prove the Lemma: 


If the transformation (31) is applied to the original field quantities, the 
same law (31) holds after the application of inversion for any covariant 
with respect to Ly constructed with ordered products of the original field 
quantities and their derivatives of finite order‡.

To prove the Lemma we simply observe that the sign defined by (32) 
is also the character|| of the permutation 


(1, 2, ..., N) → (N, N − 1, ..., 2, 1)


and that the inversion multiplies any product by this character (the 
Bosons do not play any role here). Therefore the extra factor (32) is 
just cancelled by the character of the inversion. 
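
The counting behind the Lemma can be checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes only the extra factor stated above ((−i)^(N−1) for N odd, (−i)^N for N even) and computes the character of the reversal permutation by counting inversions. The function names are illustrative.

```python
from itertools import combinations


def reversal_character(n_factors: int) -> int:
    """Character (+1 or -1) of the permutation (1, ..., N) -> (N, ..., 1)."""
    reversed_order = list(range(n_factors, 0, -1))
    # Count the pairs that stand in decreasing order (inversions).
    inversions = sum(1 for a, b in combinations(reversed_order, 2) if a > b)
    return -1 if inversions % 2 else 1


def extra_factor(n_factors: int) -> complex:
    """Extra factor of a product of N Fermion fields relative to (31)."""
    power = n_factors - 1 if n_factors % 2 else n_factors
    return (-1j) ** power


for n in range(1, 11):
    nu = n // 2  # N = 2*nu (even) or N = 2*nu + 1 (odd)
    # The character equals (-1)^nu and also (-1)^(N(N-1)/2).
    assert reversal_character(n) == (-1) ** nu == (-1) ** (n * (n - 1) // 2)
    # The extra factor (32) is the same sign, so the two cancel.
    assert abs(extra_factor(n) - (-1) ** nu) < 1e-12
```

The loop confirms, for N up to 10, that the extra factor and the character of the inversion are the same sign, so their product is always +1.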


The Lemma is sufficient to guarantee the existence of an SR for all
L, invariant local field theories in which the ordering of the products 
is properly taken into account, as any L, covariant field equation puts 
a quantity of definite tensor or spinor character equal to zero and as the 
invariance of all scalars with respect to SR according to (31) extends to 
arbitrary functions of scalars only. 


The considerations of this section could be generalized in two respects. 
First WIGNER’s irreducible representation of infinite degree for Ly having 
zero rest mass, which introduces a continuous variable instead of the 


* It is included here that kinematically independent Fermion fields also anticommute. 

† Compare the rule of LÜDERS [13] for it, which was quoted in the introduction.

‡ It is not even necessary that the original field quantities or the finally constructed
quantities belong to an irreducible representation of the Ly. It suffices that they belong to
a certain class among the four possible ones.

|| By the character of a permutation is understood as usual the sign which is + 1 or
− 1 according to whether the permutation is even or odd. The sign defined by (32) can
also be written (−1)^{N(N−1)/2}.




spin-index could also be taken into consideration. We did not discuss
this here because the representation does not seem to have any
connection with physics.

Secondly, an extension of these considerations to non-local theories
could be further investigated. As the form factors are usually not
considered as new fields but supposed to be scalars, the additional
assumption is only necessary here that the form factor be invariant for
a reversal of the signs of all coordinates.


ACKNOWLEDGMENTS 


I am indebted to R. JOST and G. LÜDERS for interesting discussions
on the generality of the “strong reflection.” I am also indebted to
M. GELL-MANN for valuable information on the general attitude of the
group of workers at the Princeton University towards the problems of
reflections. The latter discussions took place during my stay at the
French Summer School for theoretical physics in Les Houches during
the summer of 1952.


REFERENCES 


[1] W. PAULI; Inst. H. Poincaré Ann. 6, 137, 1936
[2] W. PAULI; Phys. Rev. 58, 716, 1940
[3] W. PAULI; Rev. Mod. Phys. 13, 203, 1941
[4] W. PAULI and F. J. BELINFANTE; Physica 7, 177, 1940
[5] W. PAULI; Progr. Theor. Phys. 5, 526, 1950
[6] G. KÄLLÉN; Helv. Phys. Acta 25, 417, 1952
[7] T. D. LEE; Phys. Rev. 95, 1329, 1954
[8] W. HEISENBERG; Z. Phys. 90, 209 and 92, 692, 1934
[9] J. SCHWINGER; Phys. Rev. 82, 914, 1951
[10] E. WIGNER; Gött. Nachr. (Math. Naturw. Klasse) 1932, p. 546
[11] T. D. NEWTON and E. P. WIGNER; Rev. Mod. Phys. 21, 400, 1949
[12] S. WATANABE; Phys. Rev. 84, 1008, 1951
[13] G. LÜDERS; Det Kong. Danske Videnskabernes Selskab, Mat.-fysiske
Meddelelser, 28, Nr. 5, 1954
[14] G. LÜDERS, R. OEHME and W. E. THIRRING; Z. Naturforsch. 7a, 213, 1952
[15] A. PAIS and R. JOST; Phys. Rev. 87, 871, 1952
[16] G. LÜDERS; Z. Phys. 133, 325, 1952
[17] L. C. BIEDENHARN and M. E. ROSE; Phys. Rev. 83, 459, 1951
[18] H. A. TOLHOEK and S. R. DE GROOT; Phys. Rev. 84, 151, 1951
[19] L. EISENBUD and E. P. WIGNER; Proc. Nat. Acad. U.S.A. 27, 281, 1941
[20] L. MICHEL; Nuovo Cimento 10, 319, 1953


ON THE QUANTUM THEORY OF
FIELDS


L. D. Landau 


In the last seven or eight years, owing to the notable work of SCHWINGER, 
FEYNMAN, Dyson and others, quantum electrodynamics has made very 
great progress. Perturbation theory has been reconstructed in a rela- 
tivistically invariant way, and it has been shown that this can be used 
as a foundation on which to develop an unambiguous procedure 
leading to finite expressions for the effects for which infinities were 
previously obtained. Moreover, by means of the same procedure it has 
been possible to calculate, in closed form, the corrections to any order. 
The striking agreement between the results thus obtained and experi- 
ment has fully confirmed the correctness of the methods which have 
been developed. 


Thus an electrodynamics, based on the concept of a point interaction 
described by the product of operators at one and the same point in 
space, could be freed from infinities, and in consequence its field of 
applicability was extended in a remarkable manner. 


The situation in the theory of π-meson interaction is very different.
For some types of interaction (in particular, pseudovector coupling), 
the removal of the infinities in the same way as for electrodynamics 
was found to be quite impossible. In the case of pure pseudoscalar 
coupling, where the plan could be formally carried out, a comparison 
with observation has shown that the value of the coupling constant 
renders impossible any application of perturbation theory. On the 
other hand, the existing theory offers no possibility of calculating 
quantities except in the form of series in perturbation theory. These 
are most probably asymptotic, and so, for large values of the coupling 
constant, they give no information concerning the quantities they 
represent.

However, even in electrodynamics, the method at present existing
for the removal of infinities has retained, to a considerable extent, the
nature of a recipe, and this is a serious obstacle to the further
development of the theory. We shall therefore attempt, first of all, to explain
the existing theory without using quantities which are actually infinite,
and we shall thus be able to determine, at the same time, the limits of
applicability of this theory, which is usually, but unjustifiably, assumed
to be unrestricted.


Since the consideration of a point interaction leads at once to
infinities, it appears reasonable to regard it as the limit of some
“smoothed-out” interaction with a finite radius, as this radius decreases
to zero. In doing so, we have no reason to suppose that the constant e₁,
which appears as a coefficient in the interaction and is the “intrinsic”
charge of the electron in quantum electrodynamics, is independent
of the radius of interaction. Furthermore, the dependence of this
constant on the radius of interaction should be so defined that the final
result of the theory (the expressions for physical effects) is independent
of the radius of interaction, since this was introduced as an auxiliary
quantity and is therefore devoid of physical significance.


An approach of this kind means, essentially, the rejection of any
unjustified consideration of point interaction by means of a δ-function.


All the divergences in electrodynamics are logarithmic, as is well
known. The only exception is the quadratic divergence in the intrinsic
mass of the photon. This latter fact, however, is not a serious objection,
for it is easy to see that the appearance of a mass of the photon, under
the influence of its interaction with charged particles, contradicts the
law of conservation of charge. If the “smoothing-out” is effected in
such a way that the law of conservation of charge is not thereby
violated, the corresponding expressions should reduce to zero identically.


The logarithmic divergence of the integrals in perturbation theory,
which are taken over the momenta of virtual particles, always occurs
in the range p ≪ k ≪ Λ, where k is the variable of integration, Λ ~ 1/a
is the upper limit of integration, a being the order of magnitude of the
radius of the “smoothed-out” interaction, and p is the order of
magnitude of the four-dimensional momenta considered, if it is large
compared with the mass m of the electron. If |p²| ≪ m², the lower limit of
the logarithmic range of integration is m. If p² for the electron is close
to m², an additional logarithmic range occurs, connected with what is
called the “infra-red catastrophe,” which we shall touch on below.


It is well known also that the degree of the logarithmic divergence
nowhere exceeds the order of perturbation theory that is being applied,
that is, the quantity logₑ (Λ²/p²) enters any expression in a degree not
greater than that of e², the square of the charge*. We emphasize that,
in such an approach to the problem, the charge e₁ is the unobservable
charge of the electron, which depends on the value of Λ chosen, and in
no way coincides with the physical charge e.

* For simplicity, we shall always write logₑ (Λ²/p²), with the understanding that, if
|p²| ≪ m², it should be replaced by logₑ (Λ²/m²).

Thus the convergence of a series in perturbation theory is directly
connected with the value of the parameter e² logₑ (Λ²/p²). The
condition that perturbation theory is applicable over the entire range is
thus

$$e^2 \log_e(\Lambda^2/m^2) \ll 1. \tag{1}$$


As is well known, the results hereby obtained can be renormalized,
that is, if the physical charge of the electron is defined by its
interaction with quanta of zero frequency and its mass, as the physical mass
of the electron, the undetermined constant Λ disappears from the
formulae for the physical effects. The quantities e and m are then
expressed in terms of the “intrinsic” e₁ and m₁ in the form of series
in powers of e² and logₑ (Λ²/m²).

This renormalizability of the theory is in reality only approximate,
since terms of the order of p²/Λ² are neglected. Since, in applying
perturbation theory, Λ is restricted by the inequality (1), this error cannot
possibly be made arbitrarily small. However, it is easy to see that its
order of magnitude is exp (− constant/e²), that is, it is extremely small,
because e² is small. We note that the exponential character of the error
is an additional argument in favour of the asymptotic nature of the
series.

If the condition (1) is not fulfilled, the use of perturbation theory 
becomes impossible. The problem therefore arises of finding the 
fundamental quantities without the direct use of perturbation theory. 
In the work of L. D. LANDAU, A. A. ABRIKOSOV and I. M. KHALATNIKOV
[1-4], such a calculation was performed on the assumption that

$$e_1^2 \ll 1, \tag{2}$$


which is a much weaker condition than (1). For the calculation it is
necessary to take into account that, when the condition (2) is fulfilled,
we can neglect all terms which, for a given power of the logarithm,
contain higher powers of e² than the principal terms. As we have
already said, the lowest power of e² is the same as the power of the
logarithm. Hence, in order to obtain the first approximation, it is
sufficient to confine ourselves to a combination of terms of the type
[e² logₑ (Λ²/p²)]ⁿ.

As base functions it is natural to take, in the first place, the Green’s
function of the electron G(p) and that of the photon D_μν(k), that is,
the exact expressions for the electron and photon lines, taking into
account the corrections of all orders. These functions are expressed
in terms of the corresponding operators ψ and A for the electron and
the vector potential. In a co-ordinate representation,

$$D_{\mu\nu}(x - x') = \frac{i}{4\pi}\,\langle T(A_\mu(x)\,A_\nu(x'))\rangle,$$

$$G(x - x') = -i\,\langle T(\psi(x)\,\bar\psi(x'))\rangle, \tag{3}$$

where the average is taken over the physical vacuum, and the symbol
T denotes that the product is taken in the order of the time sequence of
x and x’ and, in the case of G, with the appropriate sign.


To determine these functions, it is necessary to use also the vertex
part Γ_μ(p, q; k) (where q = p − k). Here it can be shown that, to
determine the functions G(p) and D(k) for the space vectors p and k,
it is sufficient to use those Γ_μ in which all three vectors are also spatial
[1], [2]. Since G and D for time vectors can be found from G and D
for space vectors by analytic continuation, it suffices to consider only
“spatial” Γ_μ. In such Γ_μ either the quantities p², q² and k² are of the
same order of magnitude or, if one is comparatively small, the other two
are close to equality. The case where one of the three quantities is large
compared with the other two, which is quite possible for non-spatial
vectors, is here excluded. For such Γ_μ, the series in perturbation theory
contains only logarithms of the type logₑ (Λ²/f²), where f² is the greatest
of p², q² and k², and then only in powers not greater than that of e². We
note that this does not in general hold for non-spatial Γ_μ. In particular,
in the case where k² is large compared with p² and q², logₑ (k²/p²) and
logₑ (k²/q²) occur in the formulae, and their common power may be
double that of e² [5] (this question has been analysed in greater detail
by V. V. SUDAKOV [6]).

The Green’s function for the photon may, from considerations of
relativistic invariance, be always written in the form

$$D_{\mu\nu}(k) = D_t(k^2)\left(\delta_{\mu\nu} - \frac{k_\mu k_\nu}{k^2}\right) + D_l(k^2)\,\frac{k_\mu k_\nu}{k^2}. \tag{4}$$


In using perturbation theory, it is usually supposed that in the
zero-order approximation D_t⁰ = D_l⁰ = 1/k². As a result of the
perturbations, D_t changes, whilst D_l remains equal to its zero-order value,
because of the transverseness of the Dirac current (conservation of
charge). Such a choice of D_l, however, is not at all convenient. It is
important that, in consequence of gauge invariance, the choice of the
function D_l(k²) is in general arbitrary. This means that the expressions
for the physical effects are independent of D_l. The quantities which we
are considering, apart from D_t, do not exhibit gauge invariance. The
gauge invariance of D_t follows from the fact that the transverse
components A_μ − k_μk_νA_ν/k² do not vary under gauge transformation.
On the other hand, the operator ψ does vary under gauge
transformation, and the quantities G and Γ_μ change correspondingly.

We shall not discuss these changes here. The general theory of gauge
transformations developed by I. M. KHALATNIKOV and the author
allows us to calculate G and Γ_μ in terms of their values for D_l = 0, in
the case where D_l is arbitrary. In what follows, we shall therefore use
the value D_l = 0, which considerably simplifies the formulae and is
usually the most natural procedure (it corresponds directly to the
Lorentz condition ∂A_μ/∂x_μ = 0).

If we put D_l = 0, it can be shown that, on applying perturbation
theory to the quantity Γ_μ (which is spatial), the terms of the type
[e² logₑ (Λ²/p²)]ⁿ, which we are considering, reduce identically to zero.
The same is true of the radiative corrections to G, if we suppose
|p²| ≫ m², i.e. we neglect the corrections to the mass.


To determine D_t, we can now use Dyson’s equation [7]


oe 
D, (1 4 = SpIlGPl (psp —k; Gp — ky, a'p)) = ||, 


where we can substitute Γ_μ = γ_μ and G = 1/(γp − m). We then obtain

$$D_t(k^2) = \frac{1}{k^2}\,\frac{1}{1 + \dfrac{e_1^2}{3\pi}\log_e\!\left(\dfrac{\Lambda^2}{-k^2}\right)} \qquad (|k^2| \gg m^2). \tag{5}$$

In the case k² ≪ m² we have logₑ (Λ²/m²) instead of logₑ (Λ²/−k²) in
this formula. For k² = Λ², (5) gives D_t = 1/k², which corresponds to
the fact that, for values of k² ≳ Λ², interaction is absent, and the
particle behaves as if free.

Formula (5) is, in the following familiar sense, conditional in nature.
In its derivation it was supposed that the only particle which takes part
in vacuum polarization is the electron, which most probably is not
true. It is very probable that the interaction, with the electromagnetic
field, of the particles capable of strong non-electromagnetic interaction
(nucleons, π-mesons) decreases rapidly with wave-length beyond the
“radius” of the particle, so that they do not make an appreciable
contribution to vacuum polarization. However, it is possible that there
are particles (μ-mesons?) incapable of strong interaction, which make
an important contribution to vacuum polarization. At present, nothing
can be asserted regarding the number of kinds of such particles or their
properties.


If vacuum polarization is effected only by particles with spin ½ and
charge equal to that of the electron, then, in the formula (5), a coefficient
appears equal to the number ν of kinds of particle:

$$D_t(k^2) = \begin{cases} \dfrac{1}{k^2}\,\dfrac{1}{1 + \dfrac{\nu e_1^2}{3\pi}\log_e\!\left(\dfrac{\Lambda^2}{-k^2}\right)} & (|k^2| \gg m^2), \\[2ex] \dfrac{1}{k^2}\,\dfrac{1}{1 + \dfrac{\nu e_1^2}{3\pi}\log_e\!\left(\dfrac{\Lambda^2}{m^2}\right)} & (|k^2| \ll m^2). \end{cases} \tag{6}$$

Particles with charge Ze evidently make a contribution of Z² to ν.
Particles with spin 0 make a contribution of ¼Z² to ν, as is shown by a
comparison with perturbation theory ([8]; this question has been
analysed in greater detail by L. P. GOR’KOV and I. M. KHALATNIKOV).
In the case of particles with spin 1, the divergence which arises is not
logarithmic but quadratic [9]. We shall examine this situation more
closely below.

An analysis of physical effects shows that the physical charge e is
related to e₁ by

$$e^2 = e_1^2 \lim_{k^2 \to 0} k^2 D_t(k^2). \tag{7}$$


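We therefore obtain from formula (6)

$$e^2 = \frac{e_1^2}{1 + \dfrac{\nu e_1^2}{3\pi}\log_e\!\left(\dfrac{\Lambda^2}{m^2}\right)}, \tag{8}$$

$$e_1^2 = \frac{e^2}{1 - \dfrac{\nu e^2}{3\pi}\log_e\!\left(\dfrac{\Lambda^2}{m^2}\right)}. \tag{9}$$

Expressing the function D_t in terms of e² we have

$$k^2 D_t(k^2) = \frac{e^2}{e_1^2}\,\frac{1}{1 - \dfrac{\nu e^2}{3\pi}\log_e\!\left(\dfrac{-k^2}{m^2}\right)}. \tag{10}$$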


Formula (10) is evidently renormalizable, that is, we can replace the
unobservable charge e₁ by the physical charge e, in every formula, by
multiplying the function D_t by the renormalizing factor Z = e₁²/e²,
whereupon all quantities will no longer contain the cut-off limit Λ or
the charge e₁ related to it.
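
The statement that the cut-off drops out can be checked numerically. The sketch below is an illustration only: it assumes the leading-logarithm forms of (6), (8), (9) and (10) in the regime |k²| ≫ m², takes e² = 1/137 as an assumed value of the physical charge, and uses illustrative function names. For several cut-offs it computes e₁² from (9) and confirms that the combination e₁²k²D_t built from (6) agrees with the cut-off-free expression obtained from (10).

```python
import math


def physical_charge_sq(e1_sq, nu, log_cutoff):
    """Formula (8): e^2 from e_1^2, with log_cutoff = log(Lambda^2 / m^2)."""
    return e1_sq / (1.0 + nu * e1_sq / (3.0 * math.pi) * log_cutoff)


def bare_charge_sq(e_sq, nu, log_cutoff):
    """Formula (9): e_1^2 from e^2."""
    return e_sq / (1.0 - nu * e_sq / (3.0 * math.pi) * log_cutoff)


def effective_charge_sq_bare(e1_sq, nu, log_cutoff, log_k):
    """e_1^2 k^2 D_t from formula (6), with log_k = log(-k^2 / m^2)."""
    return e1_sq / (1.0 + nu * e1_sq / (3.0 * math.pi) * (log_cutoff - log_k))


def effective_charge_sq_physical(e_sq, nu, log_k):
    """e_1^2 k^2 D_t from formula (10); the cut-off does not appear."""
    return e_sq / (1.0 - nu * e_sq / (3.0 * math.pi) * log_k)


E_SQ = 1.0 / 137.0  # assumed value of the physical e^2
NU = 1              # one kind of spin-1/2 particle (assumption)
LOG_K = 10.0        # illustrative momentum, log(-k^2 / m^2)

for log_cutoff in (20.0, 50.0, 100.0):
    e1_sq = bare_charge_sq(E_SQ, NU, log_cutoff)
    assert e1_sq > E_SQ  # the bare charge exceeds the physical charge
    assert math.isclose(physical_charge_sq(e1_sq, NU, log_cutoff), E_SQ)
    assert math.isclose(
        effective_charge_sq_bare(e1_sq, NU, log_cutoff, LOG_K),
        effective_charge_sq_physical(E_SQ, NU, LOG_K),
    )
```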




The charge e₁ defined by formula (9) is always greater than the charge
e. This is natural, since vacuum polarization should lead to a decrease
of the “original” charge. This result is related to the properties of the
function D_t, which increases monotonically with k². This is one example
of a general property of Green’s functions, which was found in the
works of KÄLLÉN [10] and LEHMANN [11].


Formula (10) satisfies the condition derived in the work of GELL- 
MANN and Low [12]. This condition can be derived particularly 
simply by starting from the concepts explained here. 


The above theory shows that, in calculating the base functions
corresponding to some values (k²)₀, it is sufficient to consider a range
of values of k² of the order of (k²)₀ and, in particular, large compared
with this quantity. The range of values of k² ≪ (k²)₀ has only a
negligible correction effect (actually of the order of [k²/(k²)₀]²), since the
corresponding range of integration is small, and the integral
converges well for small k. We can thus “move down” from the limit Λ
to the values of k in which we are interested, and take no notice, in the
calculation, of much smaller values of k.


It follows from this that if we consider values of k² ≫ m², the mass
cannot appear in the formula for D_t. Hence, from dimensional
considerations, the dimensionless quantity k²D_t can depend only on
the two dimensionless quantities e₁² and k²/Λ². On the other hand,
the quantity e₁²D_t determines the physical effects, and therefore, for a
given value of the physical charge e, it cannot depend on the cut-off
limit Λ. This is possible only if

$$e_1^2 k^2 D_t = f\!\left[-\frac{k^2}{\Lambda^2}\,\psi(e_1^2)\right],$$

where f and ψ are arbitrary functions, and the relation

$$\frac{m^2}{\Lambda^2}\,\psi(e_1^2) = \chi(e^2)$$

holds, χ being again some arbitrary function. Introducing the function
φ which is the inverse of f, these relations can be conveniently written
in the form

$$\phi(e_1^2 k^2 D_t) = -\frac{k^2}{\Lambda^2}\,\psi(e_1^2) = -\frac{k^2}{m^2}\,\chi(e^2). \tag{11}$$

A comparison with formula (6) shows that, in the approximation
we are considering,

$$\phi(x) = \psi(x) = \chi(x) = e^{-3\pi/\nu x}.$$


These formulae are to be regarded as the asymptotic values of the
functions for small x. The results in paper [13] allow us to calculate
the next approximation also. Here we obtain


d(x) — Gar ree a: 
x(x) = pore **. 


There is no sense in refining the difference between ψ(x) and φ(x), since
the nature of the “smoothing-out” is seen therein.

If we considered higher approximations, we should obtain

$$\phi(x) = e^{-3\pi/\nu x}\,x^{-9/4\nu}\,F(x), \tag{12}$$

where F(x) is a series of powers of x, and a similar expression for ψ(x).
We notice that the presence of an exponential term, which is not
decomposed into a series, is some argument in favour of the asymptotic
character of the series F(x).


Formula (6) has a range of applicability which is considerably wider
than that of perturbation theory. However, even (6) does not make it
possible to conclude the process of passing to the limit. As follows
from (9), for sufficiently large Λ there is always a point where e₁
becomes of the order of unity, and the application of the approximation
concerned becomes impossible.


Thus “weak-coupling” electrodynamics is a theory which is,
fundamentally, logically incomplete. It might be thought that this makes it
necessary to supplement it by “strong coupling” at high energies. We
shall show, however, that there are serious reasons for supposing that,
if we regard the physical charge e as a function of e₁ and Λ, then, for a
given Λ, no increase in e₁ can lead to an increase of e above some
limit, which tends to zero as Λ increases*.


To show this, we consider values of e₁ which are not too small
compared with unity, and, at the same time, values of k which are
small compared with Λ, such that (νe₁²/3π) logₑ (Λ²/k²) ≫ 1. It
follows then from formula (6) that D_t = 3π/νe₁²k² logₑ (Λ²/−k²). We
introduce, instead of the vector potential, the quantity 𝔄_μ = e₁A_μ.
Then the interaction term in the Lagrangian will not contain the charge
e₁, while the term corresponding to the Lagrangian of the free field
will contain e₁² in the denominator. The function D_t corresponding to
the vector 𝔄 takes the form

$$D_t = \frac{3\pi}{\nu k^2 \log_e(\Lambda^2/-k^2)}. \tag{13}$$

* The considerations given in what follows are due to I. J. POMERANCHUK and the
author [14].


This expression does not contain the charge e₁. Hence it may be thought
that it is obtained essentially by neglecting the free field term in the
Lagrangian. It is difficult to imagine that the legitimacy of neglecting
this term would diminish as e₁ increases further.


The applicability of formula (13) for large e₁ means that the charges
e₁ ≫ 1 polarize the vacuum to such an extent that the “effective
charge” √[e₁²k²D_t(k²)] becomes of the order of unity even if the ratio
Λ²/k² ~ 1. In other words, if within some radius a there is
concentrated an arbitrarily large charge, then, owing to vacuum polarization,
the total charge inside a radius 2a will be of the order of unity.


Let us consider the physical charge e as a function of e₁ and Λ.
Since we suppose that, as e₁ tends to infinity, D_t tends to the limit (13),
we have for the physical charge (see (7), (8)):

$$e^2 = \frac{3\pi}{\nu \log_e(\Lambda^2/m^2)}. \tag{14}$$

Since this expression tends to zero as Λ → ∞, we reach the conclusion
that within the limits of formal electrodynamics, a point interaction is
equivalent, for any intensity whatever (e₁² → ∞), to no interaction at all.


It is curious that a completely paradoxical situation has resulted.
For 25 years it was supposed that the use of the δ-function leads
inevitably to infinite interactions. However, formula (8) shows
convincingly that the δ-function (e₁ independent of the cut-off radius)
leads in fact to zero interaction, and even an unlimited increase of e₁
does not seem to save the situation.


Conversely, the theory considered, for a given physical charge e,
seems to have a “ceiling,” in that it cannot in principle be used to
discuss an energy greater than Λ₀, the value of Λ for which e₁ → ∞
(as we have already said, this practically coincides with the value
corresponding to e₁ ~ 1), or consequently a distance less than 1/Λ₀.
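
To get a feeling for where this ceiling lies, one can solve for the cut-off at which the denominator of formula (9) vanishes, that is, (νe²/3π) logₑ (Λ²/m²) = 1. The sketch below is illustrative only: it assumes e² = 1/137 and treats the number ν of kinds of particle as a free parameter; the function name is hypothetical.

```python
import math


def log10_ceiling_ratio(e_sq: float, nu: float) -> float:
    """log10(Lambda/m) at which the denominator of formula (9) vanishes."""
    # (nu e^2 / 3 pi) * log(Lambda^2 / m^2) = 1
    #   =>  log(Lambda / m) = 3 pi / (2 nu e^2)
    return 3.0 * math.pi / (2.0 * nu * e_sq) / math.log(10.0)


E_SQ = 1.0 / 137.0  # assumed value of the physical e^2

for nu in (1, 4, 12):
    exponent = log10_ceiling_ratio(E_SQ, nu)
    print(f"nu = {nu:2d}: ceiling at about 10^{exponent:.1f} electron masses")
```

Because the exponent scales as 1/ν, the ceiling falls steeply as more kinds of particle take part in vacuum polarization.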

Of course, no unambiguous physical conclusions can be drawn from
the result obtained, that the point interaction is zero in the case of
electrodynamics. The energies Λ for which e² ~ 3π/ν logₑ (Λ²/m²) are
in every case very large. At these energies, the effects of gravitational
interaction may exceed the electromagnetic effects, so that a discussion
of electrodynamics as a closed system becomes physically incorrect.
The idea is very attractive that this “crisis” in electrodynamics occurs
for just those energies where the gravitational interaction is comparable
with the electromagnetic. Since the effective charge in the critical range
is of the order of unity, this means that

κΛ² ~ 1,

where κ is the gravitational constant. This gives a value of the order of
10²⁸ eV for the critical Λ. Using formula (14), we then obtain ν ~ 12.
From this standpoint, the value of the physical charge e of the electron
would be automatically determined by the theory. At present, of course,
it is impossible to say whether these ideas have any real significance.
It is quite possible, in particular, that ν < 12 and that the gravitational
effects appear considerably before the effective charge becomes of the
order of unity*.
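
The estimate of the number of kinds of particle can be reproduced by solving formula (14), e² = 3π/ν logₑ (Λ²/m²), for ν. The following sketch is an illustration under stated assumptions: the electron rest energy is taken as 0.511 MeV, e² as 1/137, and the critical energy as 10²⁸ eV (the order of magnitude at which gravitational and electromagnetic interactions become comparable). The constant and function names are illustrative.

```python
import math

ELECTRON_MASS_EV = 0.511e6   # assumed electron rest energy in eV
CRITICAL_ENERGY_EV = 1.0e28  # assumed order of magnitude of the critical cut-off
E_SQ = 1.0 / 137.0           # assumed value of the physical e^2


def species_number(e_sq: float, cutoff_ev: float, mass_ev: float) -> float:
    """Solve e^2 = 3 pi / (nu * log(Lambda^2 / m^2)) for nu."""
    log_cutoff = 2.0 * math.log(cutoff_ev / mass_ev)  # log(Lambda^2 / m^2)
    return 3.0 * math.pi / (e_sq * log_cutoff)


nu = species_number(E_SQ, CRITICAL_ENERGY_EV, ELECTRON_MASS_EV)
print(f"nu is approximately {nu:.1f}")
```

Since ν depends on the cut-off only through a logarithm, changing the assumed critical energy by a factor of ten shifts the result by only a few per cent.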


If particles with spin 1 take part in vacuum polarization, the situation
is essentially different. In this case, as we have said, the divergence is
not logarithmic but quadratic. This leads to the result that the
difficulties mentioned above arise, not for extremely large values of Λ, but
for those such that e²Λ²/M² ~ 1, where M is the mass of a particle.
The probable way out of these difficulties, given above, then becomes
clearly impossible. We shall discuss below the situation which would
arise in this case. It is evidently still more pronounced for particles of
higher spin [15]. Here we merely note that there is no evidence of the
existence of such particles.


Let us now pass to the mass of the electron. In order to find the
relation between the physical and “intrinsic” masses, we write the
Green’s function of the electron in the form

$$G(p) = \frac{1}{\gamma p - m(p^2)}. \tag{15}$$

For large values of p², the ratio of the second term in the denominator
to the first becomes totally negligible, and is always much less than the
error in formula (15). Nevertheless, it is justifiable to consider this
term, since, unlike the first, it is an even function of the momentum.


We now write Dyson’s formula for G [7]: 


e2 
Gp) = yp — m, — =. JT p.p — ks K)G(p — k)y,D, fk) a*k, (16) 


and substitute the expression (15) for G (for Γ_μ we write γ_μ, as before;
the legitimacy of this is closely examined in [4]). Then, after some
calculation, we obtain

$$m(p^2) = m_1 + \frac{3e_1^2}{4\pi}\int_{p^2}^{\Lambda^2} m(k^2)\,D_t(k^2)\,d(k^2). \tag{17}$$

* We do not consider here the possibility that other non-electromagnetic interactions
begin to play an important part at high energies.



This integral equation is solved by elementary methods, and after
substituting the expression (6) for D_t(k²) we obtain

$$m(p^2) = m_1\left[1 + \frac{\nu e_1^2}{3\pi}\log_e\!\left(\frac{\Lambda^2}{-p^2}\right)\right]^{9/4\nu} \quad \text{for } |p^2| \gg m^2. \tag{18}$$

The physical mass is obtained from this by putting p² ~ m², so that
the relation between m₁ and m is given by the formula

$$m = m_1\left(\frac{e_1^2}{e^2}\right)^{9/4\nu}. \tag{19}$$

According to this formula, m₁ decreases when Λ increases. This
makes it reasonable to suppose that if we were able to extend the
theory to Λ → ∞, m₁ would reduce to zero; this would mean that the
mass of the electron is electromagnetic in origin. Thus, these concepts
lead to a peculiar return to the long abandoned idea of a purely
electromagnetic mass of the electron.
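
The step from the integral equation (17) to the closed form (18) can be checked numerically. The sketch below is an illustration under stated assumptions: it writes L = logₑ (Λ²/|k²|), uses k²D_t = 1/(1 + aL) with a = νe₁²/3π from formula (6), so that (17) becomes the differential equation dm/dL = (3e₁²/4π) m/(1 + aL) with m = m₁ at L = 0, and integrates it with a simple midpoint rule. The function names and the sample parameter values are illustrative.

```python
import math


def running_mass_closed_form(m1, e1_sq, nu, log_ratio):
    """Closed form (18), with log_ratio = log(Lambda^2 / |p^2|)."""
    a = nu * e1_sq / (3.0 * math.pi)
    return m1 * (1.0 + a * log_ratio) ** (9.0 / (4.0 * nu))


def running_mass_numeric(m1, e1_sq, nu, log_ratio, steps=20000):
    """Integrate dm/dL = c m / (1 + a L) from L = 0 using the midpoint rule."""
    a = nu * e1_sq / (3.0 * math.pi)
    c = 3.0 * e1_sq / (4.0 * math.pi)
    h = log_ratio / steps
    m = m1
    for i in range(steps):
        big_l = i * h
        slope_start = c * m / (1.0 + a * big_l)
        m_mid = m + 0.5 * h * slope_start          # half step
        m += h * c * m_mid / (1.0 + a * (big_l + 0.5 * h))
    return m


# Illustrative values: e_1^2 = 0.5, one kind of particle, log(Lambda^2/|p^2|) = 10.
exact = running_mass_closed_form(1.0, 0.5, 1, 10.0)
approx = running_mass_numeric(1.0, 0.5, 1, 10.0)
assert math.isclose(exact, approx, rel_tol=1e-6)
```

The exponent 9/4ν is simply the ratio of the two coefficients, (3e₁²/4π)/(νe₁²/3π), which is why the charge drops out of it.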


A special case is formed by the electron Green’s function G and the
vertex parts Γ_μ when the squares of the momenta of the electrons
approach m², a circumstance which is related to the so-called
“infra-red catastrophe.” This topic was discussed by A. A. ABRIKOSOV [16],
who showed that the corresponding value of the function G is

lee 
( a iar (20) 


neg = — 
m 


Contrary to general opinion, this function has at p² = m² not a simple
pole, but a branch point. We point out that, for D_l ≠ 0, the power of
the denominator depends on the limit of k²D_l(k²) as k² → 0.


The expression for Γ_μ(p, q; k) (where q = p − k) is different from γ_μ
only in the case where both p² and q² are close to m². We shall not
write it out here.


As has already been said, the theory given above does not apply to
Γ_μ(p, q; k) when the smallest of p², q² and k² is very small compared
with the largest, and the two largest are not very close to each other.
Particularly important effects are found when the inequalities k² ≫ p²
and k² ≫ q² hold. In this case, terms appear in the formula which
depend on the product e² logₑ (k²/p²) logₑ (k²/q²), and increase
considerably more rapidly with k than the terms considered above. The
analysis carried out by V. V. SUDAKOV [6] has shown that the effects
which occur here are also connected with the infra-red catastrophe.


If we consider physical effects at very high energies, it is necessary to
take into account the fact that the diagrams of higher orders, which
are not given here (i.e. those which do not reduce to a variation of G,
D and Γ in the simplest diagram), although they do not lead to
divergences, contain the logarithms of the ratios of the energies of various
particles. Hence, in calculating the corresponding effects, we must,
following the method developed above, sum the diagrams which, for a
given order of perturbation theory, contain the highest powers of the
logarithms. We note that here also the degree of the logarithm is
usually twice that of the diagram.


In the consideration of the Compton effect (given by A. A.
ABRIKOSOV [16]), it is found that the chief difference compared with the
elementary theory concerns the infra-red catastrophe. The Compton
diagram with electron termini, for which p² = m², reduces to zero, as
would be rigorously true. Here it can be shown that if we consider the
emission of additional photons, the total effective cross-section of such
a Compton effect is given, to the approximation concerned*, simply by
the Klein-Nishina formula. It should, however, be mentioned that the


amplitude of ee scattering (scattering at 0 = 0) is multiplied by 


a factor exp (= int) increasing with the energy of the photon. 


Let us now pass from quantum electrodynamics to another funda- 
mental problem of quantum field theory, the theory of meson interac- 
tions. The situation here differs radically from that in quantum 
electrodynamics. Whereas quantum electrodynamics permits us to 
calculate even small corrections, meson theories have essentially given 
no quantitative results at all which are correct. 


The chief cause of this failure of meson theories is the strength of 
meson interactions. The interaction of an electron with an electro- 
magnetic field is weak. The corresponding dimensionless constant, 
involving the charge on the electron, is very small (e²/3π ~ 1/1000).
On the other hand, all the experimental data on meson interactions 
show that they cannot be regarded as weak from any point of view. 


Numerous attempts at quantitative calculations of meson interactions 
have been chiefly concerned with two types of interaction, pseudoscalar 
and pseudovector. Both these interactions relate to mesons which, as 
is shown by experiment, have zero spin and odd parity with respect to 
the proton and neutron (for the 7°-meson the odd parity is absolute); 
these interactions are written down similarly to those of particles with 
an electromagnetic field, where the meson field (pseudoscalar coupling) 


* We notice that since, as we have said already, terms with squares of logarithms
appear in this case, it is here assumed that the product of e² and the logarithm is small
compared with unity, in contrast to what was assumed in the preceding part of the article.




or its gradient (pseudovector coupling) plays the part of the vector
potential. The pseudovector coupling cannot be renormalized (we 
shall return to this question in more detail below). Attempts to apply 
perturbation theory to pseudoscalar coupling have led only to the 
result that the values of the corresponding constant, calculated from the 
appropriate experiments, differ from one another by factors of ten or 
even a hundred. We notice also that the corresponding dimensionless 
constant g², which is analogous to the charge, is of the order of 10 to 15
in the majority of effects, and this completely rules out the possibility 
of applying perturbation theory. 

Despite the fact that the consideration of weak meson interactions 
has thus proved physically inadequate, we shall nevertheless analyse 
the situation which arises here, on the grounds that such an analysis is 
essential in order to understand the problem itself. 


We begin with weak pseudoscalar coupling. Here too only logarith- 
mically divergent integrals occur (apart from the rest mass of the meson). 
We can, therefore, employ the approach explained at the beginning of
this article. However, the simplification which occurs in electrodynamics
for D_l = 0 as regards the vertex part Γ, which was found to be
simply equal, in the approximation considered, to its “zero-order”
value, is here absent. In the approximation considered, however, it is
found to be possible to sum all diagrams giving effects of the required
order (just as, in electrodynamics, we discuss “spatial” Γ). It can be
shown [1] that, in this case, Γ is the sum of its zero-order value γ₅ and
diagram 1, where all lines correspond to the exact Green’s functions


Fig. 1


G and D, and all angles to exact values of Γ. All additional corrections
to this diagram contain powers of g² which exceed the power of
the logarithm by at least unity.

As well as Dyson’s equations, we thus obtain a whole system of 
equations from which the functions G, D and Γ can be determined.




The solution of this system is fairly complicated, particularly so since
it is necessary, in calculating G and D, to take account in Γ(p, p − k; k)
of small corrections of order p/k for k ≫ p, and of corrections of order
k²/p² for k ≪ p.

This question has been considered in detail in the work of A. A. 
ABRIKOSOV, A. D. GALANIN and I. M. KHALATNIKOV [17], where it was 
shown that for charge-symmetric theory 


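$$G(p) = \frac{1}{\hat{p}}\left[1 + \frac{5g_1^2}{4\pi}\log_e\left(\frac{\Lambda^2}{-p^2}\right)\right]^{-3/10},$$

$$D(k) = \frac{1}{k^2}\left[1 + \frac{5g_1^2}{4\pi}\log_e\left(\frac{\Lambda^2}{-k^2}\right)\right]^{-4/5},$$

$$\Gamma(p, p-k; k) = \gamma_5\left[1 + \frac{5g_1^2}{4\pi}\log_e\left(\frac{\Lambda^2}{-f^2}\right)\right]^{1/5}, \tag{21}$$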


where f² is the greatest of p², (p − k)² and k², Λ is the upper cut-off
limit, and g₁ is the value of the unobservable constant of the
pseudoscalar interaction. All these formulae are written for p² > M², k² > M²,
f² > M². If these inequalities do not hold, we must write logₑ (Λ²/M²)
in each formula.

This theory can be renormalized. The value of the physical constant 
g which characterizes the interaction at small energies is obtained, by 
comparison with physical effects, in the form 


$$g^2 = g_1^2\,\bigl[\gamma_5\,\Gamma\,(\hat{p} - M)\,G\bigr]^2\,(k^2 D), \tag{22}$$
where all the quantities are taken for small energies. This gives 
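$$g^2 = \frac{g_1^2}{1 + \dfrac{5g_1^2}{4\pi}\log_e\left(\dfrac{\Lambda^2}{M^2}\right)} \tag{23}$$

or

$$g_1^2 = \frac{g^2}{1 - \dfrac{5g^2}{4\pi}\log_e\left(\dfrac{\Lambda^2}{M^2}\right)}. \tag{24}$$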


Substitution of (24) in (21) gives 


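$$G(p) = \frac{1}{\hat{p}}\left[1 - \frac{5g^2}{4\pi}\log_e\left(\frac{-p^2}{M^2}\right)\right]^{-3/10},$$

$$D(k) = \frac{1}{k^2}\left[1 - \frac{5g^2}{4\pi}\log_e\left(\frac{-k^2}{M^2}\right)\right]^{-4/5},$$

$$\Gamma(p, p-k; k) = \gamma_5\left[1 - \frac{5g^2}{4\pi}\log_e\left(\frac{-f^2}{M^2}\right)\right]^{1/5}.$$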




These formulae, disregarding the inessential constant factors, depend
only on g, and not on g₁ or Λ.

The formulae (23), (24) which renormalize g coincide exactly, apart
from the coefficient, with formulae (8), (9) which renormalize the charge.
In particular, it follows from them that g₁ increases with Λ, and this
corresponds to a strengthening of the interaction as the energy increases.
However small the value of g, we always reach a region, for sufficiently
large Λ, where g₁ ~ 1, that is, the interaction cannot be regarded as
weak.
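
The behaviour just described can be illustrated numerically. The sketch below is not taken from the article: it assumes that the renormalization relation has the form g₁² = g²/(1 − (5g²/4π) L) with L = logₑ(Λ²/M²), and that the bracket in the Green's functions has the form 1 + (5g₁²/4π) logₑ(Λ²/(−p²)). All function and variable names are illustrative.

```python
import math

# Assumed coefficient 5/(4*pi) multiplying g^2 * log in the
# charge-symmetric pseudoscalar theory.
A = 5.0 / (4.0 * math.pi)


def bare_coupling_sq(g_sq: float, log_cutoff: float) -> float:
    """g1^2 = g^2 / (1 - A g^2 L), with L = log(Lambda^2 / M^2)."""
    denom = 1.0 - A * g_sq * log_cutoff
    if denom <= 0.0:
        raise ValueError("L is at or beyond 4*pi/(5*g^2), where g1^2 diverges")
    return g_sq / denom


def physical_coupling_sq(g1_sq: float, log_cutoff: float) -> float:
    """Inverse relation: g^2 = g1^2 / (1 + A g1^2 L)."""
    return g1_sq / (1.0 + A * g1_sq * log_cutoff)


def critical_log(g_sq: float) -> float:
    """Value of L = log(Lambda^2 / M^2) at which g1^2 diverges."""
    return 4.0 * math.pi / (5.0 * g_sq)


def bracket_ratio(g_sq: float, log_cutoff: float, log_p: float) -> float:
    """Bracket written with g1 divided by the bracket written with g.

    log_p stands for log(-p^2 / M^2).  The ratio is 1 / (1 - A g^2 L)
    for every log_p, so it is only a constant factor.
    """
    g1_sq = bare_coupling_sq(g_sq, log_cutoff)
    bare = 1.0 + A * g1_sq * (log_cutoff - log_p)
    renormalized = 1.0 - A * g_sq * log_p
    return bare / renormalized


if __name__ == "__main__":
    g_sq = 0.1
    for log_cutoff in (5.0, 10.0, 20.0, 25.0):
        print(log_cutoff, bare_coupling_sq(g_sq, log_cutoff))
    print("g1^2 diverges at L =", critical_log(g_sq))
    print(bracket_ratio(g_sq, 10.0, 1.0), bracket_ratio(g_sq, 10.0, 5.0))
```

With these assumptions g₁² grows with the cut-off for fixed g², however small g² is chosen, and the two forms of the bracket differ only by a factor that does not depend on the momentum.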


While discussing the pseudoscalar theory of meson interaction, let us 
examine also the problem of the scattering of one meson by another. 
It is generally asserted that this phenomenon, not being renormalizable, 
cannot be considered within the limits of the theory. To remove the 
non-renormalizability, an attempt is made to introduce additional 
terms into the Lagrangian [18], but then a new undetermined constant 
appears in the end. 


Let us see how the problem of the scattering of one meson by another 
appears from the standpoint of the present article. In applying pertur- 
bation theory, the amplitude of scattering of one meson by another at 
small energies, apart from a numerical factor, is taken as g⁴ logₑ (Λ²/M²).
This result is obviously not renormalizable, that is, for a given value
of g, it depends also on the cut-off limit Λ. This non-renormalizability,
however, must not be regarded as a defect of the theory, but has a
definite physical significance. The introduction of the cut-off means
that virtual particles with energies greater than Λ are excluded from
consideration. The renormalizability, for instance, of the scattering 
of mesons by nucleons should really be regarded only as signifying that 
only virtual particles with energies of the order of those of the colliding 
particles take part in this phenomenon. In the scattering of mesons 
by mesons, however, virtual particles of larger energies also take part. 


It is clear that, to obtain a concrete result, it is necessary that the 
part played by particles of various initial energies should diminish from 
some energy onwards. If a theory of pseudoscalar meson coupling can 
exist, then, as we have seen, the weak coupling must pass into a strong 
one at some energy. It is natural to suppose that the boundary where 
the weak coupling becomes strong is the most important energy in 
this case. 

Thus we arrive at the following method of solving the problem. It is 
necessary to obtain an expression for the scattering of one meson by 
another, taking into account all corrections of relative order 
[g² logₑ (Λ²/M²)]ⁿ, and to substitute in it g² ~ 1 and logₑ (Λ²/M²) =
4π/5g². However, it is found that, to take into account corrections of




the order required, it is necessary to sum a set of “parquet” diagrams 
consisting of an infinite number of nucleon squares joined by meson 
lines, and it is laborious to find the corresponding formulae. 


Let us now examine the problem of pseudovector coupling. This 
coupling differs noticeably from pseudoscalar coupling in the following 
respect. In pseudoscalar coupling (as in electrodynamics), the coeffi- 
cient in the interaction operator is a direct measure of the coupling 
strength. It is clear from a comparison with pseudoscalar coupling 
that in pseudovector theory a similar part is played not by the coefficient 
f in the interaction operator, but by fk/M. In other words, pseudo- 
vector coupling with a small constant f, unlike pseudoscalar coupling 
for a given g₁, automatically gives strong coupling at high energies.


This state of affairs has the result which is usually called the non- 
renormalizability of pseudovector coupling. All the divergences here 
are not logarithmic, but quadratic, and the absence of the logarithmic 
situation completely changes the character of the theory. The charac- 
teristic feature of the variants considered above was the very slow change 
of the coupling strength with energy. Between perturbation theory and 
the strong coupling lay a region of “‘renormalizable”’ theory, which was 
large for small g. In the case of pseudovector coupling, the region 
between perturbation theory and the strong coupling is completely 
absent. We note that the position is quite similar to that in the electro- 
dynamics of particles with spin 1 (and probably with higher spins). 

Thus we reach the conclusion that, in all the variants of meson 
theory, we inevitably find strong coupling at high energies. Since 
meson couplings are in reality not weak even at energies of the order of 
Mc², this means that the coupling becomes strong even for E > Mc².
Here we are speaking, not of a coupling of a given intensity, but of the 
increase of the effective coupling with energy. The construction of 
such a theory of strong coupling at high energies appears to be the main 
problem in this field. The theory of meson showers evolved by FERMI [19] 
and further developed by the author [20] shows that the theory of strong 
coupling must lead ultimately to a hydrodynamic picture. The con- 
sideration of diagrams with a small number of lines in the case of strong 
coupling is clearly insufficient, and diagrams with a large number of 
lines must be an important factor; the future theory must relate such 
diagrams to the equations of relativistic hydrodynamics. 


There is, however, another possibility. We have seen in the case of 
electrodynamics that a point interaction can lead to the absence 
of any interaction, even if its intensity increases without limit. The 
possibility cannot be excluded that this is a general property of point 
interactions. In this case, the construction of meson theories is possible 




only by abandoning the point interaction, that is, by renouncing 
essentially all the methods at present existing. The great difficulties 
which arise in a physical “smoothing-out” of particles, as opposed to a
purely formal “smoothing-out” such as was discussed in the present
article, are well known. In this case, therefore, the theory of meson 
interactions would draw a blank. 


We emphasize that the physical “smoothing-out” related to the
introduction of some “fundamental length” of the order of 10⁻¹³ cm
must inevitably have some effect on electrodynamics, although at these 
energies no logical difficulties arise. Unfortunately, the chief electro- 
dynamic effects at high energies, namely bremsstrahlung and pair- 
production by photons, occur (in the system where the electron is at 
rest), at energies of the order of its rest mass, so that the study of these 
phenomena at high energies can give no information in this direction. 
An elementary calculation shows that a fundamental length of the 
order of 10⁻¹³ cm must lead to important changes in the Compton
effect or in the annihilation of positrons at energies of the order of 
10'%eV. The study of these phenomena might be of the greatest 
importance to theoretical physics. 


* * * 


It is a great pleasure to me to contribute this article to the present 
volume dedicated to NIELS BOHR, the great physicist whose pioneer
work has determined the immense progress of the modern quantum 
theory. 


Note added in proof. Since this paper was written I. J. POMERANCHUK 
brought forward new arguments in favour of the absence of physical 
interaction for point particles. He succeeded in proving this statement 
rigorously in electrodynamics (as compared to not quite rigorous con- 
siderations given in the text). He gave also strong arguments in favour 
of a similar result for pseudoscalar meson coupling. These important 
results give additional strength to the point of view that meson theories 
cannot be constructed without deep changes in the basic principles of 
modern theoretical physics but of course cannot completely exclude 
the other possibility. 


REFERENCES 


[1] L. D. LANDAU, A. A. ABRIKOSOV and I. M. KHALATNIKOV; Doklady Akademii
Nauk SSSR 95 (3), 497, 1954
[2] L. D. LANDAU, A. A. ABRIKOSOV and I. M. KHALATNIKOV; Doklady Akademii
Nauk SSSR 95 (4), 773, 1954
[3] L. D. LANDAU, A. A. ABRIKOSOV and I. M. KHALATNIKOV; Doklady Akademii
Nauk SSSR 95 (6), 1177, 1954
[4] L. D. LANDAU, A. A. ABRIKOSOV and I. M. KHALATNIKOV; Doklady Akademii
Nauk SSSR 96 (2), 261, 1954
[5] R. P. FEYNMAN; Phys. Rev. 76, 769, 1949
[6] V. V. SUDAKOV; Dissertation, Institute for Physical Problems of the U.S.S.R.
Academy of Sciences, Moscow, 1954
[7] F. J. DYSON; Phys. Rev. 75, 1736, 1949
[8] M. NEUMAN and W. H. FURRY; Phys. Rev. 76, 1677, 1949
[9] D. C. PEASLEE; Phys. Rev. 81, 94, 1951
[10] G. KÄLLÉN; Helv. Phys. Acta 25, 417, 1952
[11] H. LEHMANN; Nuovo Cimento 11, 342, 1954
[12] M. GELL-MANN and F. E. LOW; Phys. Rev. 95, 1300, 1954
[13] R. JOST and J. M. LUTTINGER; Helv. Phys. Acta 23, 201, 1950
[14] L. D. LANDAU and I. J. POMERANCHUK; Doklady Akademii Nauk SSSR
102 (3), 489, 1955
[15] S. N. GUPTA; Phys. Rev. 95, 1334, 1954
[16] A. A. ABRIKOSOV; Dissertation, Institute for Physical Problems of the U.S.S.R.
Academy of Sciences, Moscow, 1955
[17] A. A. ABRIKOSOV, A. D. GALANIN and I. M. KHALATNIKOV; Doklady Akademii
Nauk SSSR 97 (5), 793, 1954
[18] P. T. MATTHEWS and A. SALAM; Rev. Mod. Phys. 23, 311, 1951
[19] E. FERMI; Phys. Rev. 81, 683, 1951
[20] L. D. LANDAU; Izvestiya Akademii Nauk, Seriya Fizicheskaya 17 (1), 51, 1953


ON QUANTUM ELECTRODYNAMICS 
L. Rosenfeld 


... et discors concordia fetibus apta est.
OVID, Metam. I, 433


WHEN I arrived at the Institute on the last day of February, 1931, for
my annual stay, the first person I saw was GAMOW. As I asked him
about the news, he replied in his own picturesque way by showing me a
neat pen drawing he had just made*. It represented LANDAU, tightly
bound to a chair and gagged, while BOHR, standing before him with
upraised forefinger, was saying: “Bitte, bitte, LANDAU, muss ich† nur
ein Wort sagen!” I learned that LANDAU and PEIERLS had just come a
few days before with some new paper of theirs which they wanted to
show BOHR, “but” (GAMOW added airily) “he does not seem to agree –
and this is the kind of discussion which has been going on all the time.”
PEIERLS had left the day before, “in a state of complete exhaustion,”
GAMOW said. LANDAU stayed for a few weeks longer, and I had the
opportunity of ascertaining that GAMOW’s representation of the situation
was only exaggerated to the extent usually conceded to artistic
fantasy.


There was indeed reason for excitement, for the point raised by 
LANDAU and PEIERLS [1] was a very fundamental one. They questioned
the logical consistency of quantum electrodynamics by contending that
the very concept of electromagnetic field is not susceptible, in quantum
theory, to any physical determination by means of measurements. The
measurement of a field component requires determinations of the
momentum of a charged test-body; and the reaction from the field 
radiated by the test-body in the course of these operations would 
(except in trivial cases) lead to a limitation of the accuracy of the field 
measurement, entirely at variance with the premises of the theory. In 
fact, the quantization of the field only entails reciprocal limitations of 
the measurements of pairs of components, arising from their non- 
commutability, but no limitation whatsoever to the definition of any 


* I am afraid this work of art has been allowed to disintegrate before its historical
value could be realized.
† This is a familiar danicism of BOHR’s for “darf ich.”




single field component. On the other hand, one had to face another 
inescapable consequence of the field quantization, the occurrence of 
irregular fluctuations in the value of any field component; the existence 
of this fluctuating “zero-field” (as it was called because it persists even 
in a vacuum) was known to be responsible for one of the divergent 
contributions to the self-energy of charged particles, but its meaning 
was very obscure. LANDAU and PEIERLS, somewhat illogically, tried to 
bring it in relation with their alleged limitation of measurability of the 
field, and this only further confused an already tangled issue. 


Measurability of electromagnetic fields* 


BOHR’s state of mind when he attacked the problem reminded me of an
anecdote about PASTEUR [3]. When the latter set about investigating
the silkworm sickness, he went to Avignon to consult FABRE. “I should
like to see cocoons,” he said, “I have never seen any, I know them only
by name.” FABRE gave him a handful: “he took one, turned it between
his fingers, examined it curiously as we would some singular object
brought from the other end of the world. He shook it near his ear. ‘It
rattles,’ he said, much surprised, ‘there is something inside.’ ” My first
task was to lecture BOHR on the fundamentals of field quantization; the 
mathematical structure of the commutation relations and the under- 
lying physical assumptions of the theory were subjected to unrelenting 
scrutiny. After a very short time, needless to say, the roles were inverted 
and he was pointing out to me essential features to which nobody had as 
yet paid sufficient attention. 


His first remark, which threw decisive light on the problem, was that 
field components taken at definite space-time points are used in the 
formalism as idealizations without immediate physical meaning; the 
only meaningful statements of the theory concern averages of such field 
components over finite space-time regions. This meant that in studying 
the measurability of field components we must use as test-bodies finite 
distributions of charge and current, and not point charges as had been 
loosely done so far. The consideration of finite test-bodies immedi- 
ately disposed of LANDAU and PEIERLS’ argument concerning the 
perturbation of the momentum measurements by the radiation reaction: 
it is easily seen that this reaction is so much reduced, for finite test- 
bodies, as to be always negligible. 


On the other hand, the construction and manipulation of extended 
test-bodies proved a most perplexing affair. To get a rough idea of the 


* This section contains an analysis of the paper on the subject by BOHR and
ROSENFELD [2], to which the reader is referred for further details.




kind of problems that had to be faced, let us just consider the measurement
of the electric field component E_x averaged over a volume V and
a time interval T. We take a test-body filling the volume V with
uniform density ρ and we measure its momentum p_x′, p_x″ at the beginning
and the end of the time interval T. By making the test-body
sufficiently heavy, we may arbitrarily reduce its displacement during
this interval, and we thus get for the average Ē_x,

$$\bar{E}_x\,\rho V T = p_x'' - p_x'. \tag{1}$$


The momentum determinations of the test-body, however, entail a
sacrifice of the knowledge of its position to a certain extent Δx, and the
resulting error ΔĒ_x on Ē_x will be, on account of (1) and the
indeterminacy relation,

$$\Delta\bar{E}_x \sim \frac{\hbar}{\rho\,\Delta x\,V T}. \tag{2}$$
This shows that we can reduce the indeterminacy of the field average 
indefinitely by increasing the charge density of the test-body. 
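
The scaling stated here can be made concrete with a small Python sketch. It assumes the estimate ΔĒ_x ~ ħ/(ρ Δx V T) for the error of the field average, with all quantities in SI units; the numerical values in the example are arbitrary illustrations and not data from the text, and the function name is hypothetical.

```python
HBAR = 1.054571817e-34  # reduced Planck constant in J s


def field_average_uncertainty(rho: float, delta_x: float,
                              volume: float, duration: float) -> float:
    """Estimate hbar / (rho * delta_x * V * T) in V/m.

    rho      charge density of the test-body in C/m^3
    delta_x  latitude in the position of the test-body in m
    volume   volume V of the averaging region in m^3
    duration time interval T of the averaging in s
    """
    if min(rho, delta_x, volume, duration) <= 0.0:
        raise ValueError("all arguments must be positive")
    return HBAR / (rho * delta_x * volume * duration)


if __name__ == "__main__":
    # Arbitrary illustrative values: a 1 cm^3 region observed for 1 ns.
    for rho in (1.0, 10.0, 100.0):
        print(rho, field_average_uncertainty(rho, 1e-9, 1e-6, 1e-9))
```

Each tenfold increase of the charge density lowers the estimated indeterminacy by the same factor, with no lower bound so long as the atomistic structure of the test-body is disregarded.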

Here we meet with a question of principle directly affecting the
logical structure of the quantum theory of fields. So long as we treat
all sources of electromagnetic fields as classical distributions of charge
and current, and only quantize the field quantities themselves, no
universal scale of space-time dimensions is fixed by the formalism. It 
is then consistent to disregard the atomistic structure of the test-bodies 
and there is no restriction to the logically admissible values of the 
charge density. It remains to be seen whether it is legitimate to isolate 
this theory of quantized fields interacting with classical sources from
the more elaborate formalism in which it is attempted to treat the
interaction of electromagnetic and material fields, both subjected to the
appropriate quantization. Such a question could of course not be
decided by formal considerations, since the mathematical consistency
of the formalism was then very much in doubt. BOHR, however, went
straight to the root of the matter by reminding us that the very formula- 
tion of the fundamental quantum postulates implies a neglect of all 
radiative couplings, which is only justified by the actual smallness of the
fine structure constant. The natural approach to field quantization can 
therefore only be regarded as consisting of a succession of steps, in 
which the effects of radiative interactions are gradually taken into 
account: the first step is just the quantization of the pure radiation 
field, the next step the quantization of the field associated with the
charged particles. Such a step-wise analysis of the content of the 
formalism is thus not merely a matter of practical convenience, but one 
of logical necessity. It is but one aspect of the situation we meet 




everywhere in quantum theory: the physical interpretation of the 
quantal formalism must be based in the last resort upon classical
idealizations. 

The formula (2) was encouraging insofar as it showed that the average 
field acting on a finite test-body can be accurately determined; but 
what is the relation of the field measured in this way to that produced 
by the given source distribution? The test-body itself is added to these 
sources, and in how far does the reaction of its own field influence the 
result of the measurement? This was indeed the central question, 
which could only be answered by a minute analysis of the behaviour
of the test-body in the measuring process. The first thing to do was to 
devise a mode of measurement in which the perturbing influence of the 
motion of the test-body would be reduced as much as possible. Outside 
the time interval T a complete neutralization is achieved by superposing
to the test-body another body of opposite charge, and the time-intervals
Δt during which the test-body is in motion, at the beginning and end of
the interval T, can be made arbitrarily small. What remains is thus an
essentially uncontrollable displacement D_x of the test-body, constant
during the whole time-interval T. 


It was then an easy matter to work out, at any rate to the classical 
approximation, the field produced by such a distribution of electric 
polarization of density ρD_x subsisting during a time T, and the momentum
imparted by this field to another test-body during a given
time-interval. It was highly gratifying to find that the expression for the
latter quantity coincides in form with either of the terms, symmetrical
with respect to the two space-time regions considered, whose difference 
forms the commutator of the corresponding pair of average field 
components. 


This formal relationship may be presented in full generality as 
follows: Let D_r(x − x′) be the Green’s function expressing the
retarded potential of a point source. The commutation relations
between field components may be written 


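$$[f_{\mu\nu}(x), f_{\kappa\lambda}(x')] = i\hbar c\,A_{\mu\nu,\kappa\lambda}(x,x')\,D(x - x'), \tag{3}$$

with the notation

$$A_{\mu\nu,\kappa\lambda}(x,x') = \left(\delta_{\mu\kappa}\frac{\partial}{\partial x'_\lambda} - \delta_{\mu\lambda}\frac{\partial}{\partial x'_\kappa}\right)\frac{\partial}{\partial x_\nu} - \left(\delta_{\nu\kappa}\frac{\partial}{\partial x'_\lambda} - \delta_{\nu\lambda}\frac{\partial}{\partial x'_\kappa}\right)\frac{\partial}{\partial x_\mu} \tag{4}$$

and

$$D(x - x') = -D_r(x - x') + D_r(x' - x). \tag{5}$$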




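For the average field components over space-time regions R, R′,

$$F_{\mu\nu}(R) = \frac{1}{R}\int_R f_{\mu\nu}(x)\,d^4x, \tag{6}$$

this becomes

$$[F_{\mu\nu}(R), F_{\kappa\lambda}(R')] = i\hbar c\,\bigl[\bar{A}_{\mu\nu,\kappa\lambda}(R,R') - \bar{A}_{\kappa\lambda,\mu\nu}(R',R)\bigr], \tag{7}$$

with

$$\bar{A}_{\mu\nu,\kappa\lambda}(R,R') = \frac{1}{RR'}\int_R d^4x\int_{R'} d^4x'\,A_{\mu\nu,\kappa\lambda}(x,x')\,D_r(x' - x). \tag{8}$$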


Now, the measurement of F_{μν}(R) involves test-bodies of charge-current
density ρ_μ, whose various elements ρ_μ d⁴x undergo displacements
D_ν which, though unknown, may be made equal; there results an
electric or magnetic polarization, uniform over the region R, of density

$$P_{\mu\nu}(x'') = P_{\mu\nu}\int_R \delta(x - x'')\,d^4x$$

with $P_{\mu\nu} = \rho_\mu D_\nu$.


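This corresponds to a charge-current density

$$j_\mu(x'') = P_{\mu\nu}\int_R d^4x\,\frac{\partial}{\partial x_\nu}\,\delta(x - x''),$$

which gives rise to a potential

$$A_\mu(x') = P_{\mu\nu}\int d^4x''\,D_r(x' - x'')\int_R d^4x\,\frac{\partial}{\partial x_\nu}\,\delta(x - x'') = P_{\mu\nu}\int_R d^4x\,\frac{\partial}{\partial x_\nu}\,D_r(x' - x).$$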


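The corresponding field component ϕ_{κλ}(x′) may be written

$$\phi_{\kappa\lambda}(x') = P_{\mu\nu}\int_R d^4x\left(\delta_{\mu\kappa}\frac{\partial}{\partial x'_\lambda} - \delta_{\mu\lambda}\frac{\partial}{\partial x'_\kappa}\right)\frac{\partial}{\partial x_\nu}\,D_r(x' - x) = \tfrac{1}{2}P_{\mu\nu}\int_R d^4x\,A_{\mu\nu,\kappa\lambda}(x,x')\,D_r(x' - x),$$

and its average over the region R′ is accordingly

$$\bar{\Phi}_{\kappa\lambda}(R,R') = \tfrac{1}{2}P_{\mu\nu}\,R\,\bar{A}_{\mu\nu,\kappa\lambda}(R,R'). \tag{9}$$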


It will be noticed that the coincidence between the quantities occur- 
ring in the formulae (9) and (7) has its origin in the relation expressed 
by formula (5), between the commutator and the Green’s function. 
We shall return later to this remarkable relation which we are now 




able to understand a bit better than was then possible. At the time, 
the expressions of the type (9) found for the interactions between the 
test-bodies gave us the hope that we were on the right track; in fact, it 
seemed as if we had hit upon a mode of measurement exactly suited to 
give the best combined accuracy, for pairs of field determinations, 
compatible with the theoretical limitation. We little imagined that it 
would still take us almost two years to reach that goal.


At first, all went well. We had a lively surprise when we found out 
that the only case of coupled measurements for which a reciprocal 
indeterminacy relation had been explicitly written down and discussed 
in the literature, namely that of an electric field and a perpendicular 
magnetic field in the same volume element, was one in which unlimited 
accuracy had to be expected from the correctly integrated commutation 
law. It was immediately clear that in such a case the mutual perturba- 
tions of our extended test-bodies would cancel out and that we could 
fulfil the theoretical prediction. More exciting was the realization that 
the reaction of the test-body upon itself in the course of the measure- 
ment of any single field component could be automatically compensated, 
at least so far as it can be classically computed. In fact, this reaction, 
being proportional to the displacement of the test-body, can be matched 
exactly by an elastic spring mechanism of known strength. 


I must not by-pass the problem of the actual momentum measure- 
ment of the extended test-bodies. To take account of relativistic effects, 
these bodies must be imagined to consist of a large number of inde- 
pendent elements, and it was far from obvious how the total momentum 
of all these elements could be obtained without multiplying the error 
far beyond the optimum limit which we required. Moreover, one had 
to make sure that the relativity requirements could be met without 
further restriction to the measurability of the momentum. This 
necessitated a much more detailed analysis of the measuring process 
than one was wont to in ordinary quantum mechanics. BOHR succeeded 
in showing that the measurement of the total momentum can even be 
performed in such a way that the displacements of the elements, though 
uncontrollable within a finite latitude Δx, are all equal, and that the
determination of the total momentum is only limited by the uncertainty
of the common displacement Δx to the extent ħ/Δx indicated by the
indeterminacy relation. The interest of this result transcends its 
immediate application to the problem of field measurement: it affords 
a specially clear example of a measuring process which can be entirely 
described in a purely classical way, and in which the origin of the 
reciprocal indeterminacies is thus directly traced to the impossibility of 
specifying the dynamical characteristics of the system without loosening 




its connection with the space-time frame of reference*. The solution of 
this problem is one of the most striking products of BOHR’s uncanny
virtuosity in this subtle kind of analysis. It cost him some hard think- 
ing, but was for him a source of intense enjoyment. 


Having satisfied ourselves of the possibility of defining with unlimited 
accuracy the space-time average of any single field-component, we 
hopefully proceeded to the discussion of a pair of such measurements. 
The investigation, however, was very soon brought to an apparent 
deadlock by a difficulty of a most baffling nature. What precipitated 
the crisis was the minus sign on the right-hand side of formula (7) for 
the commutator of the field averages. This harmless-looking feature 
revealed itself a most crucial and excruciating riddle. It was child’s 
play, with our arsenal of test-bodies and springs, to reduce the product 
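of the indeterminacies ΔF_{μν}(R) · ΔF_{κλ}(R′) to the sum

$$\hbar c\,\bigl[\,|\bar{A}_{\mu\nu,\kappa\lambda}(R,R')| + |\bar{A}_{\kappa\lambda,\mu\nu}(R',R)|\,\bigr].$$

But how could one ever hope to get down to the difference

$$|\bar{A}_{\mu\nu,\kappa\lambda}(R,R') - \bar{A}_{\kappa\lambda,\mu\nu}(R',R)|\;?$$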


At this juncture, BOHR did not hesitate to challenge the commutation
rules themselves. Repeatedly we went through every step of their 
derivation, on the look-out for possible ambiguities. I was much 
impressed by this sacrilegious attitude, revealing as it did the sovereign 
independence of a powerful mind and its readiness to submit to any 
well-founded conclusion, however at variance with common expectation. 
Eventually, we noticed that in the troublesome cases, i.e. those in which 
none of the two terms of the difference vanishes, so that there is a 
mutual perturbation of the two measurements, it is also possible to 
exchange light signals between the two regions R,R’. Closer examina- 
tion soon showed that the “‘messages” which could thus be automati- 
cally conveyed by the test-bodies contained just enough information 
about their respective displacements to conjure up the difference 
predicted by the theory. It is very striking indeed to see how the greatest 
accuracy compatible with the commutation laws can only be achieved 
by exploiting to the utmost the possibilities, afforded by the physical 
situation, of controlling the course of the measuring process. 


The realization of such a complete harmony between the formalism 
and its physical interpretation was felt as a fitting reward for our 
tribulations, and with great alacrity BOHR presented our conclusions 


* It has apparently, like so many things, escaped the attention of those quixotic young 
physicists who, spell-bound by distorted echoes of a lore unfathomed, try to split the 
quantum with the rusty sword of mechanistic materialism. 




and handed in the manuscript of our paper at a meeting of the Danish 
Academy on the 2nd December, 1932. The reading of the fourteen or so 
successive proofs only took about one more year. One point especially 
still gave us much trouble to the very last: what part has one to assign 
to the field fluctuations in the logical structure of the theory? This is 
a quite fundamental question, which I have kept to the last for some 
comment. 


The self-reactions and mutual perturbations of the test-bodies can 
only be compensated by spring mechanisms to the extent of their 
classically evaluated average magnitudes. Their actual values, however, 
owing to the quantal character of the electromagnetic field, deviate from 
the classical averages to an extent which necessarily escapes our control: 
for such a control would involve a determination of the actual numbers 
of photons emitted in the various modes of oscillation and accordingly 
entail the loss of our knowledge of the phase relationships between 
these modes. Such statistical fluctuations accompanying any field 
defined by classical sources have, however, a universal character: in 
fact, they are just the zero-field fluctuations. This may be seen in the 
following way: 


The radiation part of any field may be represented, in the usual way,
in the form

$$F = \sum_i \bigl(a_i f_i + a_i^\dagger f_i^*\bigr), \tag{10}$$

in terms of the annihilation and creation amplitudes $a_i$, $a_i^\dagger$ belonging
to the different modes labelled by the index i. We may take formula (10)
to represent the space-time average of some field component over a
given space-time region; the $f_i$, $f_i^*$ will then be the corresponding
space-time integrals over the appropriate progressive phase functions.
If the field is defined by a classical source distribution, the expectation
values $\langle a_i\rangle$, $\langle a_i^\dagger\rangle$ are given in modulus and phase:

$$\langle a_i\rangle = \bar{N}_i^{1/2}\,e^{i\gamma_i}, \qquad \langle a_i^\dagger\rangle = \bar{N}_i^{1/2}\,e^{-i\gamma_i}; \tag{11}$$

in this notation, $\bar{N}_i$ represents the average number of photons of
mode i.


Now, we can readily set up the expression for the state-vector 
describing this situation. In fact, since the emission processes which 
build up the radiation field are statistically independent events, the 
actual numbers $N_i$ of photons in the different modes are distributed
around the averages $\bar N_i$ according to the Poisson formula

$$p(N_i) = \frac{\bar N_i^{N_i}}{N_i!}\, e^{-\bar N_i}, \qquad (12)$$

and the state-vector is of the form

$$\Psi(N_1, N_2, \ldots) = \prod_i \varphi(N_i) \qquad (13)$$

with $|\varphi(N_i)|^2 = p(N_i)$.

In order to obtain the expressions (11) for the average amplitudes, it
suffices to fix the phases of the factors $\varphi(N_i)$ by

$$\varphi(N_i) = |\varphi(N_i)|\, e^{iN_i\eta_i}. \qquad (14)$$


Using the state-vector (13), (14), a straightforward calculation shows
that the mean square fluctuation of $F$,

$$\langle (F - \langle F\rangle)^2 \rangle,$$

reduces exactly to the expression $\sum_i |f_i|^2$, quite independent of the $\bar N_i$'s
and $\eta_i$'s.
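The statement that the fluctuation does not depend on the mean photon numbers or the phases can be checked numerically for a single mode. The following is only a minimal sketch under stated assumptions: the Fock space is truncated at a finite dimension, the amplitudes are taken as the square roots of the Poisson weights with a phase factor exp(i N phase), and the function names `poisson_amplitudes` and `fluctuation` are illustrative choices, not taken from the text.

```python
import math
import numpy as np

def poisson_amplitudes(mean_n, phase, dim):
    """Amplitudes sqrt(p(N)) * exp(i*N*phase), N = 0..dim-1, with Poisson p(N)."""
    n = np.arange(dim)
    if mean_n == 0.0:
        # Zero-field case: all the weight sits on N = 0.
        amp = np.zeros(dim, dtype=complex)
        amp[0] = 1.0
        return amp
    log_p = (n * math.log(mean_n) - mean_n
             - np.array([math.lgamma(k + 1.0) for k in n]))
    return np.exp(0.5 * log_p) * np.exp(1j * n * phase)

def fluctuation(mean_n, phase, f, dim=120):
    """Mean square fluctuation of F = f*a + conj(f)*a_dagger in that state."""
    # Truncated annihilation operator: a|N> = sqrt(N)|N-1> (assumption: dim is
    # large enough that the Poisson tail beyond it is negligible).
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
    F = f * a + np.conj(f) * a.conj().T
    psi = poisson_amplitudes(mean_n, phase, dim)
    mean_F = np.vdot(psi, F @ psi).real
    mean_F2 = np.vdot(psi, F @ (F @ psi)).real
    return mean_F2 - mean_F ** 2

f = 0.3 + 0.4j  # hypothetical mode coefficient with |f|^2 = 0.25
for mean_n in (0.0, 1.0, 4.0, 9.0):
    print(mean_n, fluctuation(mean_n, phase=0.7, f=f))
```

Within the truncation error, every line of the printout should reproduce |f|^2, including the case of zero mean photon number, which is the zero-field fluctuation.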


The Poisson distribution of photons has lately been established by 
very elegant calculations utilizing the new methods of quantum electro- 
dynamics [4]. In those days, however, people were not so learned.
Judging from the surprise which the formula (12) excited in those who
saw it, I think BOHR and, no doubt, PAULI must have been the only ones
who knew that this property is an inherent part of the photon idea. 


The property just discussed of the quantal field fluctuations gives the 
clue to the assessment of their significance for the consistency of the 
theory. In fact, it becomes clear that the fluctuations affecting the fields 
produced by the test-bodies during the measurements cannot be 
separated from those which equally affect the field which we want to 
measure. The impossibility of compensating or controlling them in any 
way, far from being an imperfection of the measuring device, is a pro- 
perty which it must necessarily possess to ensure that all the conse- 
quences of the theory are in principle verifiable by measurement. The 
fact that the zero-field fluctuations are superposed on to the classical 
field distribution is indeed a well-defined theoretical prediction, and we 
see that we are able to suppress the perturbations arising from the 
manipulation of the test-bodies to an extent which just leaves scope for 
the test of this prediction. 


Now the existence of such fluctuations certainly means that the 
causal connection of classical theory between the fields and their 
sources is lost in quantum theory: this is just one aspect of the break- 
down of determinism and its replacement by a wider form of statistical 
causality. This issue, however, should not be confused with the 
question whether the field concept can be consistently upheld in the 
quantal mode of description. Our analysis of field measurability shows 




that any measurement of a field average will always give a definite 
answer, which will be reproducible (this means that measurements of 
the same field component over two almost coinciding space-time 
regions will yield the same answer): but the result of any such measure- 
ment may differ by any amount from its classical estimate. The break- 
down of the classical connection between field and source is a feature 
of the quantum theory of fields just as fundamental as the reciprocal 
limitation of measurability of field components; both are in fact direct 
and formally quite parallel consequences of the commutation laws 
which express the quantization of the field. 


The classical solution of the field equations, expressed in terms of 
the retarded potentials corresponding to the given charge and current 
distribution, does not satisfy the commutation rules. Owing to the 
linear character of the equations, however, it can always be supplemented
by a solution $F_0$ of the type (10) of the homogeneous equations,
which represents a pure radiation field due to far-away sources not
included in the system. If there is no such external source at infinity,
the radiation field $F_0$ would in classical theory reduce to zero, but in
quantum theory it does not vanish identically. The condition $a_i^\dagger a_i = 0$,
expressing that the field does not contain any photon, implies that the
expectation value of each component of $F_0$ vanishes everywhere, but
any space-time average of the field component will exhibit fluctuations
with a finite quadratic mean

$$\langle F_0^2\rangle = \sum_i |f_i|^2.$$


This leads us again to our previous conclusion, but the present argu- 
ment shows how essentially it is linked to the characteristic non- 
commutability of the quantal variables. Indeed, one may say that the 
existence of the zero-field immediately follows from the additional 
requirement imposed on the solutions of the field equations by the 
commutation laws. 


At the end of our laborious inquiry, we had thus completely vindi- 
cated the consistency of quantum electrodynamics, at least in its 
simplest form. Our increased insight invited a re-assessment of the 
scope of the analysis we had just completed. We had set out on the 
suspicion of grave defects in the logical structure of the theory, and 
used the direct method of testing definitions of concepts by investi- 
gating the concrete measuring processes they embody. Knowing, 
however, that the formalism is free from logical flaws and that the 
physical interpretation of its symbols is in strict accordance with their 
relation to observable quantities, it becomes tautological to assert the 
possibility of constructing measuring devices capable of verifying all 




theoretical predictions. All that the actual design of such instruments 
can do is to illustrate the logical relationships and, perhaps, elucidate 
their meaning. 


Measurability of charge and current densities 


This somewhat sobering reflection did not deter us, however, from 
carrying on the investigation a step further. HEISENBERG, who was 
actively exploring the possible repercussions of the explicit consideration 
of space-time averages in quantum electrodynamics, came across 
remarkable properties of charge fluctuations [5], the interpretation of 
which was a strong incentive for us to extend the analysis of measur- 
ability to the charged fields describing the sources of electromagnetic 
radiation. Apart from this immediate motive, however, such an 
analysis had an interest of its own as a further illustration of the point 
of view of complementarity, with its wider implications for general 
human problems, never absent from BOHR’s mind even when he seems
exclusively engrossed with abstract points of epistemology or the 
ingenious technicalities of measuring contraptions. More deeply than 
anyone else PAULI grasped the earnestness of BOHR’s endeavour. His 
criticism—pointed in every sense of the word—had all the time sustained 
our exertions and he was now egging us on to continue. 


The problem now requiring attention was the use of the classical 
idealization of charge and current density even when this quantity is so 
large that the quantal properties of the charged fields must be taken 
into account. To take a simple case, consider the average charge 
contained within a given volume V during a given time-interval T. It 
is measured by the flux of electric displacement issuing from the 
volume V during the time T. We are well-equipped to measure this
flux. We surround the volume V with a “shell” of test-bodies and by
appropriate lever transmissions we transfer upon a single auxiliary 
body the sum of the normal components of momentum of all the test- 
bodies; the measurement of the flux is thus essentially reduced to the 
same manipulations of the test-bodies as that of a single field com- 
ponent. It is easily seen that in this case all electromagnetic interactions 
of the test-bodies exactly cancel out, including even the net effect of the 
zero-field, which obviously has no flux through a closed surface. 
Moreover, the electromagnetic field produced by the shell of test-bodies 
is entirely confined within the shell, and measurements of average 
charge densities over two different space-time regions can therefore be 
performed without any reciprocal limitation of accuracy on this count. 
These rather obvious results provide the physical justification for the 
stagewise treatment we have adopted, in the first approximation of 
which the charge-current densities are treated as classical quantities; 




formally, of course, the consistency of this approximation is guaranteed
by the fact that the field divergencies $\partial f_{\mu\nu}/\partial x_\nu$ commute with the field
components.


The real problem starts at the next stage of approximation, in which 
effects linear with respect to the fine structure constant are considered. 
The new fundamental feature which then makes its appearance is the 
phenomenon of electron pair production, or more generally, the 
production of a “pair field” by the electromagnetic field*. There is 
also a modification of the expression for the commutators of electro- 
magnetic field components, but this is a generally small correction 
which does not essentially affect the conclusions reached in the initial 
stage of approximation.†


The part played by the pair field in the problem of the measurability 
of charge-current densities is an almost identical pendant to that of the 
photon field in the measurability of electromagnetic field averages. 
The manipulation of the test-bodies produces inside the shell an 
electromagnetic field, which is the source of a pair field manifesting itself 
as a polarization of the surrounding “vacuum” as well as the appearance
of actual pairs of electrons of opposite signs. The expectation
value of this pair field can be calculated and since it is proportional to 
the displacement of the test-bodies, its effect on single density measure- 
ments can be completely compensated, while the mutual disturbance 
of a pair of measurements can be reduced to the optimum limit derived 
from the commutation relations for charge-current densities. The 
fluctuating part of the pair field, which cannot be compensated, is again 
an inherent property of every charge-current density, in quite the same 
sense as the electromagnetic zero-field. 


There is no need to enlarge further upon this parallelism; I could do 
no more, within the compass of this article, than transcribe the more 
precise, but still very condensed account which is given in our second 
paper [7]. However, I might just illustrate it from the more formal side 
by exposing the skeleton of the argument, in close similarity with the 
electromagnetic case. The polarization current $J_\mu(x')$ produced by an
electromagnetic field of potential $A_\mu(x)$ can be expressed by means of a
Green’s function of tensor character $B_{\mu\nu}(x',x)$:

$$J_\mu(x') = \int B_{\mu\nu}(x',x)\, A_\nu(x)\, d^4x. \qquad (15)$$


* For the sake of clarity I confine myself here to the consideration of electrons (or at 
any rate fermions) as the carriers of electric charge and current. Later on I shall have 
occasion to refer briefly to the problems raised by charged bosons. 

† This point has been investigated by E. CORINALDESI in his (unpublished) Manchester
thesis. Cf. his survey article [6], pp. 93-95.




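If the commutator between charge-current densities $j_\mu(x)$ is written in
the form

$$[j_\mu(x), j_\nu(x')] = i\hbar c\, K_{\mu\nu}(x,x'), \qquad (16)$$

we have the relation

$$K_{\mu\nu}(x,x') = -B_{\mu\nu}(x,x') + B_{\nu\mu}(x',x), \qquad (17)$$

of the same general type as that expressed by formula (5). For space-time
averages

$$J_\mu(R) = \frac{1}{R} \int_R j_\mu(x)\, d^4x \qquad (18)$$

we get

$$[J_\mu(R), J_\nu(R')] = i\hbar c\, [B_{\mu\nu}(R,R') - B_{\nu\mu}(R',R)], \qquad (19)$$

with

$$B_{\mu\nu}(R,R') = \frac{1}{RR'} \int_R d^4x \int_{R'} d^4x'\, B_{\nu\mu}(x',x). \qquad (20)$$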


Now, we must evaluate the potential $A_\mu(x)$ arising from the manipulation
of the test-bodies involved in the measurement of $J_\mu(R)$. If we
idealize the shell of test-bodies by a surface distribution of constant
charge-current density $\rho_\mu$ over the boundary S of the region R, to be
subjected to a constant normal displacement of magnitude D, we
obtain a polarization density

$$P_{\mu\nu}(x'') = P_\mu \int_S \delta(x'' - X)\, d\sigma_\nu,$$

with $P_\mu = \rho_\mu D$.

(We denote by X the point of integration on the boundary S.) The
corresponding vector potential $A_\mu(x'')$ is

$$A_\mu(x'') = -P_\mu \int d^4x'\, D_r(x'' - x')\, \frac{\partial}{\partial x'_\nu} \int_S \delta(x' - X)\, d\sigma_\nu$$

$$= P_\mu \int_S \frac{\partial}{\partial X_\nu} D_r(x'' - X)\, d\sigma_\nu = P_\mu \int_R \Box D_r(x'' - x)\, d^4x$$

$$= P_\mu \int_R \delta(x'' - x)\, d^4x, \qquad (21)$$


giving rise to an electromagnetic field which vanishes everywhere
except on the boundary S. According to (15), the polarization current
at $x'$ due to the field (21) is

$$J_\nu(x') = P_\mu \int_R B_{\nu\mu}(x',x)\, d^4x,$$

and its average over the region $R'$ is

$$J_\nu(R,R') = P_\mu R\, B_{\mu\nu}(R,R'). \qquad (22)$$




The analysis of the measurability of charge-current densities thus 
confirms and extends our previous conclusions concerning the logical 
consistency of quantum electrodynamics. The consideration of the 
quantal fluctuations of these quantities, however, raises a quite new 
issue, which has a bearing on the entirely distinct problem of the 
domain of validity of the theory for the description of the physical 
phenomena. This issue arises from the fact that with the introduction 
of particles of finite mass, we must face the existence of an absolute 
space-time scale, and the restrictions this may impose upon the con- 
struction and manipulation of our test-bodies. The behaviour of the 
characteristic pair-field fluctuations warns us that when we try to test 
theoretical predictions pertaining to regions of very small dimensions 
we move on dangerous ground. In fact, HEISENBERG’s investigation 
reveals that—in striking contrast with the fluctuations of the electro- 
magnetic field—the charge and current fluctuations within any sharply 
limited space-time region are infinite. It is only by considering the 
average fluctuation over an ensemble of space-time regions whose 
boundaries are contained in a shell of finite thickness that one gets a 
finite value, which, of course, tends to infinity as the thickness of the 
shell vanishes. 


The physical meaning of this result directly follows from the analysis 
of charge and current measurements. The fluctuations of the charge or 
current inside a given space-time region are inseparable from those of 
the pair-field produced by the shell of test-bodies in the course of the 
measuring process. We see that if we want to increase the sharpness of 
definition of the space-time region by making the shell thinner, we pay 
for this by a corresponding increase in the fluctuation of the charge or 
current enclosed in the region. The existence of these fluctuations by 
no means impairs the possibility of definition of the charge-current 
density; it is a necessary consequence of the quantization of the charged 
field. From the practical point of view, however, it is clear that if the 
fluctuation of a quantity becomes of the same order as its expectation 
value, or larger, the knowledge of the latter, however well defined, 
becomes less and less significant. In this sense, the study of the increase 
of the charge and current fluctuations with decreasing dimensions of 
the regions in which these quantities are ascertained points to a degree 
of smallness of these dimensions beyond which the information 
conveyed by the theory ceases to be useful in guiding our expectations. 


We are anxious to test those predictions of quantum electrodynamics
which exhibit typically relativistic features; the domains in which such
effects become prominent have dimensions smaller than the Compton
wave-length (divided by $2\pi$) $\lambda = \hbar/mc$ of the charged particles. The




shells of test-bodies must accordingly have a thickness b considerably
smaller than $\lambda$. In that case, it is found [8] that the root mean square
fluctuation of the charge contained in a volume of linear dimensions L
and averaged over a time of the order L/c is approximately given, in
units of the electron charge, by $[\log (L/b)]^{1/2}$. This logarithmic dependence
of the charge fluctuation on the linear dimensions b of the test-bodies
composing the shell is a very favourable circumstance*: it
allows us to go down to values of the ratio b/L of the order of a few
per cent without unduly increasing the fluctuation. Even so, if L is
smaller than the Compton wave-length of the electron, this brings b
down to nuclear dimensions. It is still conceivable that our measuring
apparatus could be built up of nucleons, or perhaps hyperons (they
have to last only for times of the order $\lambda/c$), and we thus arrive at the
satisfactory conclusion that the predictions of relativistic electron 
theory are in principle susceptible to verification; but we are obviously 
on the brink of the abyss: it is questionable, for instance, whether the 
same theory is at all meaningful for nucleons. On general grounds, 
nuclear dimensions would seem to mark the limit of validity of the kind 
of idealization which makes up the conceptual framework of electro- 
dynamics. Beyond this limit, strongly interacting fields, with a limited 
range of force transmission, call for a radically different mode of des- 
cription, in which the idealization of point charge presumably has no 
room. 
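To see how mild the logarithmic dependence quoted above really is, one may tabulate the root mean square charge fluctuation for a few values of the ratio b/L. This is only an illustrative sketch: it assumes that the logarithm is the natural one and takes the approximate formula at face value, and the function name `rms_charge_fluctuation` is a hypothetical choice.

```python
import math

def rms_charge_fluctuation(b_over_L):
    """[log(L/b)]^(1/2) in units of the electron charge (natural log assumed)."""
    if not 0.0 < b_over_L < 1.0:
        raise ValueError("expected 0 < b/L < 1")
    return math.sqrt(math.log(1.0 / b_over_L))

for ratio in (0.5, 0.1, 0.05, 0.02, 0.01):
    value = rms_charge_fluctuation(ratio)
    print(f"b/L = {ratio:5.2f}   rms fluctuation ~ {value:.2f} electron charges")
```

Under these assumptions the fluctuation stays at roughly two electron charges even when the shell thickness is only one per cent of the linear dimensions of the region.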


The formal structure of quantum electrodynamics 


It is necessary to emphasize again, in view of the confusion prevailing 
in certain circles, that the last remarks, while directly suggested by our 
discussion, do not at all imply that this discussion leads to the disclosure 
of the insufficiency of quantum electrodynamics. The limits of validity 
of the theory can only be derived from physical arguments, quite 
distinct from the logical analysis of its internal consistency. This 
analysis shows in fact that, whatever shape any more comprehensive 
theory will take in future, the present theory does give a coherent 
description of electromagnetic fields and electrons within a domain 
amply covering all practical applications. 

The steady progress of the last few years in the unravelling of the 
formal structure of quantum electrodynamics leads to the same 
conclusions. In particular, it is now widely realized that the mathema- 
tical divergencies, which were such a disturbing feature so long as they 
could not be isolated and identified in a covariant way, are not due to 


* The fluctuations of current components defined in a similar way show a much worse
behaviour [9]; they actually increase as $(L/b)^{1/2}$. These quantities, however, have no such
direct physical meaning as the charge fluctuations.




any inherent inconsistency of the formalism, but only arise from a 
failure to take account of the non-electromagnetic elements which come 
into play in domains of nuclear dimensions. 


In this respect, the situation was in no way different in the classical 
electron theory. Critical conditions under which the classical picture 
of the electron as a moving point charge breaks down, are reached when 
radiative reactions become of the same order as static interactions; the 
size of the critical domains, given by the “classical electron radius”
$e^2/mc^2$, coincides with nuclear dimensions—owing, no doubt, to some
unknown, deep-lying law of nature. It is impossible, however, as was
stressed by POINCARÉ, to account for the cohesion of an extended
electron without appealing to forces of extraneous origin. 


One often speaks in this connection of “open” theories, as opposed,
I imagine, to some sort of ideal theory which would suffer no restriction
of the range of variation of the physical quantities involved, and would
be completely self-contained. It seems to me, however, that this
opposition is quite artificial; so far as I can see, all physical theories
have always been of the so-called “open” type, and a “closed” theory is
just as much a will o’ the wisp as universal determinism.


BOHR certainly never showed any respect for the noble elegance of
a Lagrangian principle. In the stalemate of the first period of develop- 
ment of quantum electrodynamics, he repeatedly stressed that our only 
guidance in judging the adequacy of the formalism remains the corres- 
pondence with classical theory, that sesame which in more heroic days 
had served him so well. One aspect of the “openness” of the formalism 
is the divergence [10] of the series expansions in powers of the fine
structure constant $\alpha$; however, owing to the dimensionless character
of this constant and its numerical smallness, the series (regarded as
functions of $\alpha$) are of the asymptotic type, and the successive terms
are decreasing up to powers of $\alpha$ much larger than are ever likely to enter
into the most refined experiments. The correspondence argument then 
assures us that such asymptotic series will not meet with any difficulty 
of interpretation. 
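The behaviour of an asymptotic series described above can be illustrated by a toy model. The sketch below assumes, purely for illustration, terms of the form n! times the n-th power of the fine structure constant; the factorial growth of the coefficients is a hypothetical stand-in, not the actual coefficients of quantum electrodynamics, and the numerical value of the constant is the familiar approximate one.

```python
import math

ALPHA = 1.0 / 137.036  # approximate value of the fine structure constant

def log_term(n, alpha=ALPHA):
    """Natural logarithm of the toy term n! * alpha**n (assumed model)."""
    return math.lgamma(n + 1.0) + n * math.log(alpha)

def index_of_smallest_term(alpha=ALPHA, n_max=1000):
    """Order n at which the toy terms n! * alpha**n reach their minimum."""
    return min(range(n_max + 1), key=lambda n: log_term(n, alpha))

n_min = index_of_smallest_term()
print("terms decrease up to order", n_min)
print("log10 of the smallest term:", log_term(n_min) / math.log(10.0))
```

In this toy model the terms keep decreasing until the order becomes comparable with the reciprocal of the constant, far beyond any order accessible to calculation or experiment, which is the sense in which the divergence is harmless.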


From this correspondence point of view, the condensation of the 
contents of the theory into a variational principle may in a sense appear 
deceptive. A remarkable feature of the variational approach, however, 
is its ability to bring out the fundamental laws in integral form. It is 
unlikely (to say the least) that this form could help to circumvent the 
divergence of the usual procedure of piecemeal construction of solutions 
according to the differential mode of analysis. But it casts interesting
light on certain aspects of the harmony between formalism and physical 
reality. 




It is clear that in an integral formulation a central part will be played 
by the Green’s functions describing the propagation of the fields in 
space-time. These propagators are not limited, of course, to the simple 
forms $D_r(x' - x)$, $B_{\mu\nu}(x', x)$ already considered, but form an extended
class of functions involving connections between increasing numbers of 
space-time points. SCHWINGER [11] has shown that the starting point 
for a theory of propagators is a Lagrangian principle in which the 
interaction of the quantized fields with external sources, treated 
classically, is explicitly taken into consideration. It is in fact by sub- 
jecting the classical source densities to arbitrary variations that the 
fundamental integral equations satisfied by the propagators are 
derived*. Moreover, by selecting special variations of the external 
sources, according to a method independently proposed by PEIERLS [12]
(in a somewhat different context), it is possible to derive general 
expressions for the commutators or anti-commutators between the 
field variables in terms of appropriate propagators. The discussion in 
the preceding sections may have prepared us to appreciate the signifi- 
cance of this introduction of classical sources into the very foundations 
of the formalism: in fact, this step will appear quite natural when it is 
realized how essentially all physical interpretation of the formalism is 
based on reference to classical quantities in the sense of the corres- 
pondence argument. There is truly no occasion here to speak (as 
SCHWINGER does) of “fictitious” sources; on the contrary, such an
idealistic terminology conveys a topsy-turvy idea of the situation and
fails to do justice to the practical utility of a classical description of the
sources in many problems of radiation theory. 


Let $L_0$ be the usual Lagrangian density referring to a system of an
electromagnetic field $A_\mu(x)$ and a pair field $\psi(x)$, $\bar\psi(x)$, coupled together
with a strength of as yet undetermined magnitude $e^2/\hbar c$. Let us now
consider a system of external sources for these two fields, defined by a 
charge-current density J,,(x) and a potential «,(x), respectively. If we put 


AO) =F POD), M)=SayOIyW), 23) 


we may write the total Lagrangian density in the form $L = L_0 - H$,
with

$$H = -\theta^{1/2} A_\mu(x) J_\mu(x) - \lambda^{1/2}\, [\bar\eta(x)\psi(x) + \bar\psi(x)\eta(x)]; \qquad (24)$$

in this formula there appear two undetermined constants, $\theta$, $\lambda$, to whose
interpretation we shall return in a moment. Let $\Psi_n$ be a system of


* The line of investigation initiated by FERRETTI [13] and pursued by CAIANIELLO [14],
in which variations of the coupling parameter are considered, falls under the same scheme.




eigenvectors of the Hamiltonian in the absence of external sources; the
“vacuum” state $\Psi_0$ is defined as that corresponding to the least eigenvalue.
We may describe* the effect of the external sources by a canonical
transformation of these eigenvectors,

$$\Psi_n(\sigma_1) = U(\sigma_1,\sigma_2|\eta J)\, \Psi_n(\sigma_2), \qquad (25)$$

defined, with respect to an arbitrary family of space-like surfaces
$\sigma(x) = \sigma$, by the equation

$$i\hbar c\, \frac{\delta U(\sigma,\sigma_0|\eta J)}{\delta\sigma(x)} = H(x)\, U(\sigma,\sigma_0|\eta J). \qquad (26)$$

Every operator $F(x)$ pertaining to the source-free system will thus be
transformed, in presence of the external sources, into the operator

$$F^{(\eta J)}(x) = U^{-1} F(x)\, U. \qquad (27)$$

Consider, in particular, the effect of a point source

$$J_\mu(x) = \epsilon\, \delta(x - x')\, \delta_{\mu\nu}$$


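of the electromagnetic field upon the potential $A_\mu(x)$. On the one hand,
we have by definition of the Green’s function $D_r(x - x')$,

$$A_\mu^{(\epsilon)}(x) = A_\mu(x) + \epsilon\, \delta_{\mu\nu} D_r(x - x'),$$

on the other, by (27), $A_\mu^{(\epsilon)}(x) = U^{-1} A_\mu(x) U$. The transformation $U$ is the
solution of (26) in which (putting $\theta = 1$) $H = \epsilon A_\nu(x')\,\delta(x - x')$, i.e.

$$U(\sigma) = 1 \quad \text{for } \sigma < \sigma',$$

$$U(\sigma) = 1 + (i\hbar c)^{-1} \epsilon A_\nu(x')\, U(\sigma') \quad \text{for } \sigma > \sigma'.$$

By assuming $\epsilon$ sufficiently small, we may use the approximate expression

$$U(\sigma) \approx 1 + (i\hbar c)^{-1} \epsilon A_\nu(x') \quad \text{for } \sigma > \sigma',$$

which gives

$$A_\mu^{(\epsilon)}(x) \approx A_\mu(x) + (i\hbar c)^{-1} \epsilon\, [A_\mu(x), A_\nu(x')].$$

The comparison of the two expressions for $A_\mu^{(\epsilon)}(x)$ yields

$$[A_\mu(x), A_\nu(x')] = i\hbar c\, \delta_{\mu\nu} D_r(x - x') \quad \text{for } \sigma > \sigma'.$$

By interchanging the roles of $x$ and $x'$, one finds for the commutator the
value $-i\hbar c\, \delta_{\mu\nu} D_r(x' - x)$ in the case $\sigma' > \sigma$, and therefore, for arbitrary
$x$ and $x'$,

$$[A_\mu(x), A_\nu(x')] = -i\hbar c\, \delta_{\mu\nu} D(x - x'), \qquad (28)$$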


* I adopt the elegant mode of exposition of the theory proposed by UMEZAWA and
VISCONTI [15].




with the definition (5) of D(x — x’). This is an example of PEIERLS’ 
general argument [12] establishing the close relationship between 
commutator and Green’s function, which we found of fundamental 
significance in testing the consistency of the interpretation of the 
formalism. The formula (16) for the commutator of current densities 
can be derived by an entirely similar argument*. 


We have seen that an equally fundamental part is played by the
“vacuum” fluctuations of the field variables, or more generally by the
correlation functions

$$\langle [A_\mu(x), A_\nu(x')]_+ \rangle, \qquad \langle [j_\mu(x), j_\nu(x')]_+ \rangle$$

and their derivatives, integrated with respect to $x$ and $x'$ over the same
space-time region R; the notation $\langle F\rangle$ represents the “vacuum”
expectation value of the operator $F$ (in absence of external sources).
These correlation functions are connected with the corresponding
commutators by a quantitative relation, most simply expressed in terms
of Fourier transforms. If, e.g., the Fourier components of wave vector
$k$ of

$$\langle [A_\mu(x), A_\nu(x')]_+ \rangle \quad \text{and} \quad [A_\mu(x), A_\nu(x')]$$

are denoted by $\Delta^{(1)}_{\mu\nu}(k)$, $\Delta_{\mu\nu}(k)$, it is easily seen that

$$\Delta^{(1)}_{\mu\nu}(k) = \epsilon(k)\, \Delta_{\mu\nu}(k), \qquad (29)$$

where $\epsilon(k) = \pm 1$ according as the time component $k_0$ is positive or
negative; this follows from the definition of the vacuum state as the 
state of minimum energy. Only a slight extension of this argument is 
needed to establish a similar relation in the case of the current densities. 
An interesting consequence of equation (29) is to show that the order of 
magnitude of the quadratic field or current fluctuation is a critical one, 
not only for the deviation of field or current averages from their 
classical determinations, but also for the reciprocal indeterminacy of 
pairs of such field or current averages. More precisely, let us consider 
two congruent space-time regions consisting of a volume and a time 
interval, displaced with respect to each other by a space-time vector 
whose components are of the same order as the corresponding linear 


* If we apply the same reasoning directly to the fermion field variable $\psi(x)$, by considering
a source density of the form $\eta(x')\delta(x - x')$, we get the relation

$$[\psi(x), \bar\psi(x')\eta(x')] = i\hbar c\, S(x - x')\, \eta(x'),$$

where $S(x - x')$ is the corresponding propagator. Assuming, in harmony with (23), that
the source strength $\eta(x')$ anticommutes with $\psi(x)$, we arrive at the fundamental relation of
anticommutation

$$[\psi(x), \bar\psi(x')]_+ = i\hbar c\, S(x - x').$$




dimensions of the regions. The absolute value of the commutator of 
two field or current components averaged over these regions gives an 
estimate of the largest reciprocal limitation of measurability to be 
expected. With the help of relation (29), it can be shown [6] that this 
quantity is always of the same order as the quadratic fluctuation of a 
field or current component within the same region (assuming a boundary 
of finite width in the latter case)*. 

Commutators and correlation functions appear in a somewhat 
different relation as parts of the special propagators introduced by 
FEYNMAN, which embody past and future in a symmetrical way. The 
Feynman propagators may be defined by the following formulae: 


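$$D^F_{\mu\nu}(x,x') = \langle P[A_\mu(x) A_\nu(x')] \rangle, \qquad K^F_{\mu\nu}(x,x') = \langle P[j_\mu(x) j_\nu(x')] \rangle, \qquad (30)$$

in which the notation of the chronological product

$$P[F(x)G(x')] = \tfrac{1}{2}[F(x), G(x')]_+ + \tfrac{1}{2}\epsilon(\sigma,\sigma')\,[F(x), G(x')], \qquad (31)$$

with $\epsilon(\sigma,\sigma') = \pm 1$ according as $\sigma$ is $>$ or $<$ $\sigma'$,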


has been used. The Feynman propagators, from which the quantities 
of direct physical interpretation, such as the Green’s functions, and 
transition probabilities, can readily be derived, are conveniently chosen 
as the fundamental quantities in terms of which the integral field theory 
is formulated. More precisely, one has of course to consider the whole 
manifold of propagators of the general form 


$$\langle P[\psi(x_1) \ldots \psi(x_m)\, \bar\psi(y_1) \ldots \bar\psi(y_n)\, A_{\mu_1}(z_1) \ldots A_{\mu_l}(z_l)] \rangle. \qquad (32)$$
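The decomposition of a chronological product into half the anticommutator plus half the sign-weighted commutator can be verified on arbitrary matrices. The sketch below assumes the standard form of this decomposition for Bose-type operators, with a sign factor equal to +1 when the first operator is the later one and -1 otherwise; the function name `chronological_product` and the random test matrices are illustrative only.

```python
import numpy as np

def chronological_product(F, G, eps):
    """P[F G] = 1/2 [F, G]_+ + 1/2 eps [F, G], with eps = +1 or -1."""
    anticommutator = F @ G + G @ F
    commutator = F @ G - G @ F
    return 0.5 * anticommutator + 0.5 * eps * commutator

# Two arbitrary (hypothetical) operators represented by 4x4 matrices.
rng = np.random.default_rng(0)
F = rng.normal(size=(4, 4))
G = rng.normal(size=(4, 4))

# eps = +1 puts F to the left of G, eps = -1 puts G to the left of F.
print(np.allclose(chronological_product(F, G, +1), F @ G))
print(np.allclose(chronological_product(F, G, -1), G @ F))
```

The check makes explicit that the commutator part carries the whole dependence on the time order, while the anticommutator part is symmetrical with respect to past and future.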


The remarkable result of UMEZAWA and VISCONTI’s analysis [15] is 
that all these propagators are contained in the transformation U which 
expresses, according to the equation (26), the effect of external sources 
on the system; the integral equations by which they are mutually 
connected are accordingly contained in this equation. We saw how the 
fundamental commutation relations appear as consequences of intro- 
ducing weak external sources of a special form. To obtain the pro- 
pagators one has to examine the effect of the most general infinitesimal 


variation of the external sources. It is easily seen that, e.g., 
6?U(6,,02|7J) : 

eee ee 7 /2)2 , , 
OF,(z)0F,(z') caM"y’ P[UO,,0|nJ)A,(Z) U(o,0 \nJ )A,(2'‘)U(e 09 |nJ )1 


* Our paper [2] on field measurability must be amended on this point; the correction 
does not affect the argument. 


whence 
0° U(G,,05 | nJ) 
D,,(z,z' |nJ) = ¥t(0,) : 
wlZ,2'|9 AOU) “OF (2)0I,(z’) 


= (101?) . P[VY(0)A,(2)U(o,0" |nJ) 4,2) (0')], 


Y,(o2) 


and therefore 
$$D^F_{\mu\nu}(z,z') = -D_{\mu\nu}(z,z'|00), \quad \text{with } \theta = 1. \qquad (33)$$


Quite generally, the transformation $U(\sigma_1,\sigma_2|\eta J)$, regarded as a functional
of the external source densities $\eta$, $J$, is a generating functional of the
propagators; and the equation (26) gives rise, in the same sense, to a 
simple system of generating equations for an infinite set of integral 
equations linking together propagators of successive orders. 

We shall not go further into the details of this beautiful formal 
scheme, but only briefly point to some of its most salient features. In 
the first place, it is noteworthy that all problems of “renormalization”
arising from the upholding of the point charge and local interaction 
idealizations are confined to the first stage of the theory, i.e. the 
determination of the elementary propagators involving just one 
photon or one electron*. The equations relating higher order pro- 
pagators to the elementary ones do not involve any further divergence. 


To illustrate the renormalization problem, it will suffice to consider 
the one-electron propagator (whose definition is slightly modified by 
the adjunction of a normalization factor) 


PLY (0) p(x) U(o,0" (OF) 5 ¥(0')) 


(N) ’ — (791/2)2 
G?) (soe (07) = GA) Fi(a)O(0,0" OIE Ko) (34) 
It satisfies the integro-differential equation 
; r) ie mc , 
l EB ox, = = VA (x) + ad G(x, x" |0J) 
+ © fM(x£|ON)GME,x’|OD)d4E = 26(x — 2’). (35) 


In this formula, .%,(x) represents the classical value of the vector 
potential due to the external current distribution J,,, while the charge 
parameter has been replaced by the renormalized value e, = e0-/*. 
The constant 9, which thus effects the charge renormalization, repre- 
sents the undetermined polarizability of the “vacuum,” whose dielectric 
constant is 6-' and magnetic permeability is c?6: its role is merely to 
make the adjustment of the electromagnetic units rendered necessary 


* The propagator KI ’ in (30) can be reduced to the simpler one <P[y(x)p(x)). 


by the introduction of the pair field. As regards the “mass-operator”
$M(x,\xi|0J)$, which is defined in terms of the propagators and the
“vertex operator”

d[Gx,x’ |0J)}- 
Ge 0 ec aGSeaE) 


C0 iz) 


? (36) 


it embodies the modification of the undetermined mass parameter m of 
the electron due to its coupling with the electromagnetic field: if m, 
denotes the observed mass, one finds that 


V4 JM(x,E|00)GOE,x’ 00) d#E = (m, — m)GO(x,x’ |00) 
fora >o. (37) 


It is apparent from equation (35) that the propagator for a free electron 
will be affected by the renormalization factor Au-!, where yw can be 
expressed as 


P) 
b= [m + M(p)] (for y,p, -+ m, = 0) (38) 
1 


in terms of the Fourier transform M(p) of the mass operator M(x,é|00). 
The reduction to the usual space-time units is thus effected by taking 
A = mw. The constant A, which affects the particle density* py, serves to 
adjust the spatial dimensions to the value of the Compton wave-length 
h/m,c fixed by the mass value m,. 


While it is gratifying to find that the renormalization problem can 
thus be circumscribed to the definition of the elementary propagators, 
and does not arise with the higher order ones, there is another feature 
in the treatment of the latter which is rather disturbing. As we have 
just seen, the elementary propagators and the renormalization para- 
meters are susceptible to a direct and quite transparent physical 
interpretation. This is not the case with expressions of the general form 
(32); such quantities can only be related to transition probabilities of 
actual physical processes if the variables x, x’, z consist of two sets 
2 ep), (Xi ncets cy Nas Zpgis 2s « 2p) DElOngingeLS 
two space-like surfaces o, o’, respectively. The integral equations for 
these higher order propagators, however, do not allow of a restriction 
of this type in the range of variation of the space-time coordinates: 
they make the determination of the transition probabilities dependent 
upon that of purely mathematical symbols which cannot correspond 
to any well-defined physical situation. This would not in itself be 


* The factor A does not occur with the charge-current density, because of a compensation
of divergences, known as “Ward's identity,” which is a consequence of the gauge-invariance
of electromagnetic theory.


objectionable, provided that it could be shown that the equations in 
question do have solutions of the required character. The mathema- 
tical problem thus raised is one of great difficulty, especially in view of 
the fact that such solutions must not only refer to the continuous 
manifold of states corresponding to scattering processes, but also 
include the bound states of many-particle systems (e.g. the “positronium”).
Although the issue is very much in doubt, it is at any rate
a merit of the integral formulation of the theory to present it in such a 
clear-cut way. 


From the point of view of the correspondence argument, however, 
one might well regard any rigoristic conception of this problem as 
superfluous, or even ill-founded: one would, of course, attack it by 
introducing series expansions in powers of the coupling constant α;
and even though these expansions would not converge, one could 
attach a well-defined meaning to those few terms which could actually 
be computed. But the justification for this attitude is limited to the 
case of electromagnetic interactions, and the issue becomes most acute 
as soon as we try to apply the methods of quantum electrodynamics to 
the description of the couplings between other fields. 
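
The behaviour of a non-convergent but still usable expansion can be illustrated by a toy model. The sketch below is an assumption made purely for illustration: it uses the Euler series with terms of magnitude n! xⁿ, not the actual series of quantum electrodynamics, whose coefficients are not given here. The function name `smallest_term_index` is hypothetical. It finds the order at which the terms stop decreasing; up to that order the first few terms are meaningful, beyond it they are not.

```python
from fractions import Fraction

def smallest_term_index(x, max_order=1000):
    """Return the first order n at which the term magnitudes n! * x**n
    of the toy (Euler) series stop decreasing."""
    term = Fraction(1)                      # a_0 = 0! * x**0 = 1
    for n in range(max_order):
        next_term = term * (n + 1) * x      # a_{n+1} = a_n * (n + 1) * x
        if next_term >= term:               # terms no longer decrease
            return n
        term = next_term
    return max_order

# Toy expansion parameter of the size of the fine-structure constant:
weak = smallest_term_index(Fraction(1, 137))
# Toy expansion parameter larger than unity:
strong = smallest_term_index(Fraction(10))
print(weak, strong)
```

For a small parameter the terms decrease over many orders before the divergence sets in, whereas for a parameter larger than unity they grow from the very first term, so that no truncation is privileged.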


Meson couplings 


HEISENBERG was the first to point out, as long ago as 1936, that the 
behaviour of interacting fields should be essentially different according 
as the coupling strength could be expressed by a dimensionless constant, 
as in electrodynamics, or had the dimension of some positive power of a 
length*; in the latter case, one expects effects of multiple interaction to 
become predominant at high energy. The formalism of quantum 
electrodynamics cannot unambiguously be extended to couplings of 
this type, since it would involve an infinity of renormalization 
parameters [16]. 
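
The dimensional distinction can be made explicit by a rough counting argument. This is a heuristic sketch in units ħ = c = 1, not HEISENBERG's own derivation; the symbols f, n and g_eff are introduced here for illustration only. If the coupling constant f has the dimension of the n-th power of a length, the only dimensionless measure of the interaction strength that can be formed at energy E is

```latex
g_{\mathrm{eff}}(E) = f\,E^{\,n} \qquad (\hbar = c = 1),
\qquad
\begin{cases}
n = 0: & g_{\mathrm{eff}} \text{ independent of } E,\\[2pt]
n > 0: & g_{\mathrm{eff}} \to \infty \text{ as } E \to \infty,\\[2pt]
n < 0: & g_{\mathrm{eff}} \to 0 \text{ as } E \to \infty.
\end{cases}
```

On this counting, each additional elementary interaction contributes a further factor of g_eff, so that for n > 0 the multiple processes outweigh the simple ones at sufficiently high energy, while for n ≤ 0 they do not.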


The coupling of charged bosons with the electromagnetic field is 
of the first or second type according as the spin of the boson is zero or 
larger than zero. The explicit calculation of the charge-current 
fluctuations for bosons of spins 0 and 1, performed by CORINALDESI [6,
8, 9], strikingly illustrates the radical difference between the two cases: 
charged bosons of spin 0 behave exactly as charged fermions, whereas 
the charge fluctuations of spin-one bosons within a space-time region 
of unsharp boundary exhibit a much stronger divergence than those of 
fermions as the thickness of the boundary layer tends to zero. This 


* The case of a coupling strength with the dimension of a negative power of a length, 
of which no physical example is known, is similar to that of a dimensionless coupling 
constant. 


makes it extremely doubtful whether the point charge idealization can 
be applied at all usefully to such particles. 


When we come to consider the coupling between nucleons and 
π-mesons, however, we are in a quandary. We are in fact faced with
two distinct possibilities, the pseudoscalar coupling which is of the first 
type, and the pseudovector coupling which is of the second, without as 
yet sufficiently convincing clues from observation as to their relative 
importance. That the interaction of the nucleons with the pseudoscalar 
mesons is commonly treated as a pseudoscalar coupling (in what very
hurried or very lazy people call the “ps-ps” theory) is more an illustration
of human weakness than of human reason. It is true that the
“ps-ps” theory has achieved a certain amount of success, but this is not
conclusive, since the two forms of coupling become indistinguishable
at low energies and we have no reliable estimate of the extent to which 
the pseudovector coupling would affect high-energy phenomena. 


I shall only call attention to one of the problems recently discussed 
in this context, because of its relation to electrodynamics: the mass- 
difference between charged and neutral particles of the nucleon and 
π-meson species. It is tempting to interpret such mass-differences as
purely electromagnetic effects, but the computation of the electro- 
magnetic self-energies, involving the “regularization” of divergent 
integrals, introduces an arbitrariness which would seem to frustrate 
any attempt at substantiating this interpretation. However, PETERMANN
[17] has succeeded in showing, by an ingenious argument to a large
extent independent of the regularization arbitrariness, that the signs 
and orders of magnitude of the two mass differences can indeed be 
accounted for on a purely electromagnetic basis. This interesting 
conclusion, which, of course, does not preclude the existence of other 
contributions to the mass-differences, is not likely to be affected by the 
uncertainty regarding the meson-nucleon coupling. 


Even if we adopt the pseudoscalar form of meson-nucleon interac- 
tion, we are faced with a situation differing from that of electro- 
dynamics by another important circumstance: far from being a small 
quantity, like α, the dimensionless coupling constant g of meson theory
is certainly larger than unity, probably of the order of 10. Power series 
in g are no longer asymptotic, but diverge from the start, and another 
approach is needed to solve the fundamental equations of the theory. A 
most interesting step in this direction has been taken by EDWARDS [18].
He sets up for the vertex operator (36) of quantum electrodynamics 
an approximate integral equation obtained by expanding all other 
quantities in power series and retaining only the lowest terms, and he 
proceeds to solve this equation exactly. The corresponding equation 


in meson theory would be quite similar. It is true that, as the result 
shows, this procedure is still too rough even for the electromagnetic
case, but even so it reveals a remarkable property, which certainly 
subsists when the approximation is refined. 


The problem is eventually reduced to the consideration of a linear 
differential equation, whose indicial equation at infinity involves the 
coupling constant. The precise form of this equation is immaterial; 
the essential point is that the reality character of the roots, and with it 
the analytical properties of the solution, change when the coupling 
constant passes a certain critical value. This feature establishes a 
sharp distinction between the weak electromagnetic coupling
on the one hand and the strong pseudoscalar coupling of meson
theory on the other. By approaching the problem from the strong
coupling side and suitably extending EDWARDS' method, PETERMANN
[19] has recently been able to elucidate the essential features of
the anomalous magnetic moments of the nucleons, hitherto so obscure:
it appears from his analysis that the nucleonic contribution to the
magnetic moment is considerably reduced by a damping effect, whereas
the mesic contribution, which is not subject to such a reduction, yields
equal and opposite moments for the proton and the neutron. It would
seem, therefore, that the study of the meson field leads us into quite
unfamiliar regions, in which progress is not only hampered by mathematical
difficulties, but still more by the absence of any physical
principle which could fulfil a part comparable to that played in
quantum electrodynamics by BOHR's correspondence argument.


REFERENCES 


[1] L. LANDAU and R. PEIERLS; Z. Physik 69, 56, 1931
[2] N. BOHR and L. ROSENFELD; Dan. Mat.-fys. Medd. 12, No. 8, 1933
See also B. FERRETTI; Nuovo Cimento 12, 558, 1954
[3] J. H. Fasre; Souvenirs entomologiques, YX, Chapter XXIII 
[4] W. THIRRING and B. TOUSCHEK; Phil. Mag. 62, 244, 1951
R. GLAUBER; Phys. Rev. 84, 395, 1951
H. UMEZAWA, Y. TAKAHASHI and S. KAMEFUCHI; Phys. Rev. 85, 505, 1952
J. SCHWINGER; Phys. Rev. 91, 728, 1953
[5] W. HEISENBERG; Leipziger Ber. 86, 317, 1934
[6] E. CORINALDESI; Supplemento al Nuovo Cimento 10, 83, 1953
[7] N. BOHR and L. ROSENFELD; Phys. Rev. 78, 794, 1950
[8] E. CORINALDESI; Nuovo Cimento 8, 494, 1951
[9] E. CORINALDESI; Nuovo Cimento 9, 194, 1952
[10] C. HURST; Proc. Cambr. Phil. Soc. 48, 625, 1952
W. THIRRING; Helv. Phys. Acta 26, 33, 1953
A. PETERMANN; Phys. Rev. 89, 1160, 1953; Archives Genève 6, 5, 1953; Helv.
Phys. Acta 26, 291, 731, 1953


(LE) 5 Cece, Pro teas Acu! Ba 37, 452 WIL. Paw. Mer. 62, 984, 
S7ED | OB, 71k 24, 1980, 92, 128%, 1952, 93, 415. 19m. G4, 1942. 1984 

(J2] & Pesexis; Proc. Ray. Soc. A 244, 143, 1952 

(20) Peewee, Voor Chr @ 1.8L: MO, 179, Pye: 12, 434, 19% 

fi6l& & Coenen, Mowe Clecan 10. 199%. 9957) 1], 492, 1994: 12, Hi, 1954 

[15] H. UMEZAWA and A. VISCONTI (in the press)

(MC) Gatemn 1 Camrnsion ome S homnrecnt, Progeess in Theoreticas Physics 
7,377, S61, 1952 

[17] A. PETERMANN; Helv. Phys. Acta 27, 441, 1954

[18] S. F. EDWARDS; Phys. Rev. 90, 284, 1953

[19] A. PETERMANN (in the press)


QUANTUM THEORY AND RELATIVITY 
O. Klein 


Introduction 


From the two main physical theories of this century, relativity theory
and quantum theory, two general viewpoints have emerged, that of 
relativity and that of complementarity, the fruitfulness of which would 
seem to be far from exhausted. While the former of these two view- 
points was not unknown to older physics, although its development in 
the hands of EINSTEIN enormously surpassed anything dreamed of in 
the last century, the viewpoint of complementarity was unknown to 
physics before quantum theory, where it appeared in BOHR's fundamental
work on the physical meaning of the mathematical formalism
of quantum mechanics, to which the line of thought embodied in his 
correspondence principle had so largely contributed. As is well known, 
the reconciliation of the two viewpoints has from the early days of 
quantum mechanics presented physicists with unexpected difficulties, 
which are still not fully surmounted in spite of the important progress
made in later years in quantum electrodynamics and related problems. 


While non-relativistic quantum mechanics is a correspondence 
theory in the sense that a direct application of the formal rules of 
quantum theory to the corresponding classical formalism is almost 
unambiguous* and leads to physically correct results, the situation is
more complicated as soon as relativistic invariance of the formalism is 
required. Partly this is due to the circumstance that here we meet with 
problems (as in the attempts to follow up YUKAWA's idea of new kinds
of fields and particles connected with the interaction of nucleons)
where we have no classical theory. But also in the treatment of electromagnetic
fields and electrons, from which classical relativity has to a
large extent originated, characteristic difficulties have appeared, which 
seem to point to a limitation not of the points of view of complementarity
and relativity but of the correspondence treatment of
quantal problems.


* There is the well-known ambiguity with respect to the coordinates to be used in the 
Hamiltonian, when it is taken over into quantum theory, and, moreover, for identical 
particles, the choice between symmetric and antisymmetric solutions which may be said 
to point beyond the correspondence procedure. 


At the very outset of relativistic quantum mechanics two kinds of 
difficulties appeared. The one concerned the problem of incorporating 
the electron spin into a relativistically invariant wave equation. The 
other had to do with the negative energy states, which, as DIRAC pointed
out, could not in quantum theory, as in the classical theory, be simply
discarded, because they are in general connected with the positive
energy states by means of finite transition probabilities. As is well
known, DIRAC was able to give a very satisfactory solution of the first
problem, while the second problem led him to his famous theory of
positive and negative electrons. Although DIRAC in developing his
theory was led by correspondence arguments and was aiming at a theory for
single electrons in external fields satisfying the general scheme of 
quantum mechanics, the theory in its final form became a formalism 
referring to a system comprising an arbitrary and, in general, variable 
number of positive and negative electrons, thus transgressing the 
original correspondence frame. 


While this theory is usually and most directly formulated by means 
of the Jordan-Wigner rules of “second quantization” corresponding to
the Pauli exclusion principle, the scalar relativistic wave equation may,
as shown by PAULI and WEISSKOPF [1], be made the basis of a theory
of an arbitrary number of positive and negative particles of zero spin 
and symmetric statistics. This theory, of which it is still doubtful 
whether it has any application to real physical problems, forms, so to 
say, a link between the Dirac theory and the quantization of the 
Maxwell equations. 


Even this theory, however, the oldest example of quantum field
theory, is, as is well known, not free from formal difficulties, connected
with the absence of the time derivative of the scalar potential in the 
classical Lagrangian. They appeared already in the early work of 
HEISENBERG and PAULI [2], where they were provisionally surmounted 
and a proof given of the relativistic invariance of the theory. However, 
due to the circumstance mentioned (the absence of a quantum condition 
for the scalar potential) it was not possible to give the commutation 
relations an explicitly invariant form. This difficulty came to play a
prominent rôle in later years in connection with the recent development
of a strictly relativistic perturbation theory for quantum field
theory by TOMONAGA, SCHWINGER, FEYNMAN and others. The most 
obvious attempt to treat all four potentials on equal footing by adding 
the Lorentz condition (that the divergence of the potential four-vector 
ought to be zero) to the wave equations for the four potentials, a 
procedure found so convenient in classical electrodynamics, proved to 
be in contradiction with the commutation relations. And even the 


proposal of Fermi [3], not to regard the Lorentz condition as an 
operator equation but as a condition on the state vector of the whole 
system, needed a deep-going revision of the whole formalism (developed
by GUPTA and BLEULER [4]) in order to give a consistent, relativistically
invariant theory. Moreover, this result was only reached by a restriction 
of the gauge invariance of the theory, which is hardly justified on 
physical grounds. We shall later on come back to the Fermi-Gupta- 
Bleuler theory which seems a most interesting example of a deviation 
from the ordinary scheme of quantum mechanics, where one starts 
from a classical Lagrangian by means of which the canonical variables 
are defined and the quantum conditions introduced. 


I shall not enter here on the divergence difficulties which play so 
large a rôle in the present discussion of quantum field theory. But here
stress shall be laid on the well-known fact that the invariance demanded 
by the special theory of relativity together with gauge invariance are 
not sufficient to decide between an infinite number of different possible 
generalizations of electromagnetism in order to cover the experience 
presented by nuclear and meson physics. Although a number of 
interesting suggestions have been made in connection with the 
increasing knowledge of the types of reactions occurring in this region, 
we seem still very far from a definite theory. 


Here the power of the point of view of general relativity in disclosing 
the laws of gravitation should be remembered. As has often been 
stressed by EINSTEIN, these laws, although they are simple when 
regarded from the point of view of relativity, have so complicated a 
mathematical structure that extracting them from empirical knowledge 
without the aid of any such deep-going principle would have been a
hopeless task. Now, it is very usual to regard the point of view of 
general relativity as insignificant in quantum theory because the direct 
effects of gravitation in ordinary atomic phenomena are very small. 
This, however, may easily be the same kind of fallacy, which it would 
have been to regard the electron spin as unimportant for the formulation 
of the laws of chemical binding, because the direct interaction between 
spin magnetic moments is, in general, negligible compared with 
chemical binding energies. In the following we shall tentatively take 
the point of view that general relativity is fundamental for the formula- 
tion of the laws of quantum field theory and that the demand of an 
adequate formulation of other invariance claims, e.g. that of gauge 
invariance, should be regarded as an indication of the need of a natural 
generalization of the relativity postulate. As a kind of programme we 
shall thus put forward the following claim: The operators to be 
used in quantum field theory should have a simple connection to a 


transformation group (so far insufficiently known) which contains the 
general coordinate transformations in spacetime as a subgroup. The 
quantum conditions ought to characterize the group in question. 


In trying to develop a theory according to such a programme it 
should be kept in mind that the direct quantization according to the 
ordinary scheme of quantum mechanics of the Einstein equation meets 
with difficulties of the same type, but very much enhanced, as those 
met with in the quantization of the Maxwell equations*. Also from 
this point of view it would seem preferable to start with the quantum 
conditions expressing group properties instead of starting with a 
Lagrangian density. This would probably make the theory still more 
symbolic and remote from direct observation than ordinary quantum 
field theory. Here we must have recourse to correspondence in the more 
general sense of the word, that, as so strongly emphasized by BOHR,
the non-classical laws will have to contain the macroscopic classical 
laws governing our tools of measurement as a limiting case, without 
which no connection would be created between the symbols of the 
theory and observational data. In EINSTEIN’s general relativity theory 
this connection is, as is well known, furnished by the principle of equiva- 
lence, according to which gravitational fields may be removed within 
an infinitesimal region of space-time by means of a change of the 
coordinate system. Now, a principle of this kind is missing in electro- 
magnetism in its usual form. In a generalized quantum-relativity 
theory, comprising also electromagnetism and perhaps meson fields 
corresponding to the nuclear forces, there would probably be some 
kind of generalized equivalence principle. It may perhaps be worth 
while to see whether the divergence difficulties of present theories 
could possibly be ameliorated, when all fields of force satisfy a principle 
of this kind. 


General space-time transformations and quantum conditions 


As a preliminary to the programme in question we shall now consider
the group of general space-time coordinate transformations from a
quantal point of view†. Let thus $x^1, x^2, x^3, x^4$ be the four space-time
coordinates regarded as c-numbers, $x^1, x^2, x^3$ forming a space-like


* See in this connection L. ROSENFELD, Ann. d. Phys. 5, 113, 1930, where the importance
of transformation groups for the formulation of quantum conditions in field
theories (among others that of general relativistic invariance for the quantization of the
gravitational field) has been strongly emphasized. See also a number of more recent
papers by BERGMANN, etc. (P. G. BERGMANN and R. THOMSON, Phys. Rev. 89, 400, 1953,
where references to earlier papers may be found); S. N. GUPTA, Proc. Phys. Soc. A65,
160, 608 (1952).

† The units are the usual ones, $\hbar = c = 1$ and Heaviside-Lorentz units for the
electromagnetic quantities.


hypersurface for any given value of the general time coordinate $x^4$.
An infinitesimal coordinate transformation is given by

$$x'^\mu = x^\mu + \xi^\mu(x), \tag{1}$$

the $\xi^\mu(x)$ being arbitrary, infinitesimal c-number functions of the four
coordinates*. Then a vector with the covariant components $A_\mu(x)$ at
the point $x$ is transformed according to the formula

$$A'_\mu(x') = A_\mu(x) - \varepsilon^\nu_\mu(x)\, A_\nu(x) \tag{2}$$

(summation with respect to $\nu$) where

$$\varepsilon^\nu_\mu(x) = \frac{\partial \xi^\nu(x)}{\partial x^\mu}. \tag{3}$$

According to (2) the expression $A_\mu(x)\,dx^\mu$ is an invariant of the
transformation†.

In order to determine the commutators characterizing the group of
transformations (2) we regard the $A_\mu(x)$ for different $\mu$- and $x$-values as
a one-row matrix $A$, writing (2) in the form

$$A' = A(1 - Q) \tag{4}$$

where $Q$ is a square matrix with the elements $(\mu, x|Q|\mu', x')$. Thus
we have

$$(AQ)_{\mu,x} = \int A_{\mu'}(x')\, d^4x'\, (\mu', x'|Q|\mu, x), \tag{5}$$

where a summation with respect to $\mu'$ is implied. We put now

$$(\mu', x'|Q|\mu, x) = \int \varepsilon^\nu_\rho(y)\, d^4y\, (\mu', x'|Q^\rho_\nu(y)|\mu, x), \tag{6}$$

so that, comparing (5), (6) with (2), we get

$$(\mu', x'|Q^\rho_\nu(y)|\mu, x) = \delta^{\mu'}_\nu\, \delta^\rho_\mu\, \delta^4(x - y)\, \delta^4(x - x'), \tag{7}$$

where $\delta^4(x)$ is a four-dimensional $\delta$-function. By means of (7) we form
now the characteristic commutation expressions $[Q^\nu_\mu(x), Q^{\nu'}_{\mu'}(x')]$ and
obtain after a straightforward calculation

$$[Q^\nu_\mu(x), Q^{\nu'}_{\mu'}(x')] = \delta^4(x - x')\,\{\delta^\nu_{\mu'}\, Q^{\nu'}_\mu(x) - \delta^{\nu'}_\mu\, Q^\nu_{\mu'}(x)\}. \tag{8}$$


These relations must hold for any representation of the group in 
question. 
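
The algebraic content of such relations can be checked in a finite-dimensional analogue. The sketch below is an illustration under stated assumptions, not part of the original argument: the continuous label x is dropped, the index range n is arbitrary, and the names `unit`, `commutator` and `check_relation` are hypothetical. It builds the matrix units E(a, b), carrying a single 1 in row b and column a, and verifies that [E(a, b), E(c, d)] = δ(a, d) E(c, b) − δ(c, b) E(a, d), which is the index structure assumed here for the commutation relations of the group.

```python
def unit(a, b, n):
    """Matrix unit E(a, b): a single 1 in row b (lower index), column a (upper index)."""
    return [[1 if (i == b and j == a) else 0 for j in range(n)] for i in range(n)]

def matmul(x, y):
    """Plain matrix product of two square lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(x, y):
    """Return x*y - y*x."""
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]

def check_relation(n=4):
    """Verify [E(a,b), E(c,d)] = delta(a,d) E(c,b) - delta(c,b) E(a,d) for all indices."""
    for a in range(n):
        for b in range(n):
            for c in range(n):
                for d in range(n):
                    lhs = commutator(unit(a, b, n), unit(c, d, n))
                    ecb, ead = unit(c, b, n), unit(a, d, n)
                    rhs = [[(a == d) * ecb[i][j] - (c == b) * ead[i][j]
                            for j in range(n)] for i in range(n)]
                    if lhs != rhs:
                        return False
    return True

print(check_relation(4))
```

The continuous case differs only by the additional factor δ⁴(x − x′), which arises in the same way from the product of the δ-functions in the matrix elements.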


* Because we wish to avoid the introduction of a metric at this stage of the theory,
leaving that till we come to the gravitational field, and since we shall need space-time
volume integrals and four-dimensional δ-functions, we shall restrict the coordinate systems
so that the space-time volume element is invariantly given by $d^4x = dx^1 dx^2 dx^3 dx^4$.

† We have here used the inverse transformation, i.e. the transformation of the covariant
vector components, because in the example given later the $A_\mu$ are going to represent these
components of the electromagnetic potential vector.


We shall now consider a special representation of the group closely
connected with the formalism of symmetrical field quantization*. Let
thus $A_\mu(x)$ and $B^\mu(x)$ be a set of operators satisfying the commutation
relations

$$[A_\mu(x), A_\nu(x')] = [B^\mu(x), B^\nu(x')] = 0, \qquad [A_\mu(x), B^\nu(x')] = i\,\delta^\nu_\mu\, \delta^4(x - x'). \tag{9}$$

With

$$Q^\nu_\mu(x) = \frac{i}{2}\big(A_\mu(x) B^\nu(x) + B^\nu(x) A_\mu(x)\big) \tag{10}$$

the relations (8) are easily shown to be fulfilled. Moreover, with

$$Q = \int \varepsilon^\nu_\mu(x)\, d^4x\, Q^\mu_\nu(x) \tag{11}$$

we get directly, using (9) and (10)

$$A'_\mu(x) = (1 - Q) A_\mu(x) (1 + Q) = A_\mu(x) - \varepsilon^\nu_\mu(x) A_\nu(x),$$
$$B'^\mu(x) = (1 - Q) B^\mu(x) (1 + Q) = B^\mu(x) + \varepsilon^\mu_\nu(x) B^\nu(x). \tag{12}$$

The $A_\mu(x)$ thus transform as covariant and the $B^\mu(x)$ as contravariant
vector components, the commutation relations maintaining their form
during the transformation.
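
The step from (9), (10) and (11) to (12) may be sketched as follows. The sketch assumes the reading $[A_\mu(x), B^\nu(x')] = i\delta^\nu_\mu\delta^4(x-x')$ for (9), $Q^\nu_\mu = \tfrac{i}{2}(A_\mu B^\nu + B^\nu A_\mu)$ for (10) and $Q = \int \varepsilon^\nu_\mu Q^\mu_\nu\, d^4x$ for (11), and it keeps only terms of first order in the infinitesimal coefficients.

```latex
\begin{aligned}
[A_\mu(x),\, Q^\rho_\sigma(x')]
  &= \tfrac{i}{2}\Big(A_\sigma(x')\,[A_\mu(x), B^\rho(x')] + [A_\mu(x), B^\rho(x')]\,A_\sigma(x')\Big)
   = -\,\delta^\rho_\mu\,\delta^4(x - x')\,A_\sigma(x'),\\[4pt]
[A_\mu(x),\, Q]
  &= \int \varepsilon^\sigma_\rho(x')\,[A_\mu(x),\, Q^\rho_\sigma(x')]\,d^4x'
   = -\,\varepsilon^\sigma_\mu(x)\,A_\sigma(x),\\[4pt]
(1 - Q)\,A_\mu(x)\,(1 + Q)
  &= A_\mu(x) + [A_\mu(x),\, Q] + O(\varepsilon^2)
   = A_\mu(x) - \varepsilon^\nu_\mu(x)\,A_\nu(x).
\end{aligned}
```

The same calculation with $[B^\mu(x), A_\sigma(x')] = -i\,\delta^\mu_\sigma\,\delta^4(x - x')$ gives $[B^\mu(x), Q] = +\varepsilon^\mu_\nu(x) B^\nu(x)$, i.e. the opposite sign characteristic of contravariant components.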


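We shall now consider the operators which transform the field
quantities $A_\mu(x)$, $B^\mu(x)$ at a point $x$ into the corresponding quantities
at a point $x + \delta x$, whereby the $\delta x^\mu$ fulfil the restriction

$$\frac{\partial\, \delta x^\mu}{\partial x^\mu} = 0. \tag{13}$$

Putting

$$T_\nu(x) = -\frac{1}{2}\left(\frac{\partial A_\mu(x)}{\partial x^\nu}\, B^\mu(x) + B^\mu(x)\, \frac{\partial A_\mu(x)}{\partial x^\nu}\right) \tag{14}$$

and

$$T = i \int \delta x^\nu(x)\, d^4x\, T_\nu(x), \tag{15}$$

we have

$$[A_\mu(x), T] = \delta x^\nu(x)\, \frac{\partial A_\mu(x)}{\partial x^\nu}$$

and thus

$$(1 - T) A_\mu(x) (1 + T) = A_\mu(x) + \delta x^\nu(x)\, \frac{\partial A_\mu(x)}{\partial x^\nu} = A_\mu(x + \delta x), \tag{16}$$

and further

$$(1 - T) B^\mu(x) (1 + T) = B^\mu(x) + \frac{\partial}{\partial x^\nu}\big(\delta x^\nu(x)\, B^\mu(x)\big) = B^\mu(x + \delta x), \tag{17}$$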

* The connection between the operators occurring in the formalisms of symmetrical
and antisymmetrical quantization respectively with linear transformations has been
stressed by JORDAN, Z. Phys. 75, 648, 1932 and 94, 531, 1935.


where we have assumed that a partial integration may be performed 
without contributions from the boundary, and where in the last step 
the restriction (13) has been used. 


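Further we put

$$P_\mu = \int d^4x\, T_\mu(x), \tag{18}$$

the quantities $iP_\mu$ being thus operators for unit displacements of the
coordinates corresponding to $T$, i.e.

$$i\,\frac{\partial A_\nu(x)}{\partial x^\mu} = [P_\mu, A_\nu(x)], \qquad i\,\frac{\partial B^\nu(x)}{\partial x^\mu} = [P_\mu, B^\nu(x)].$$

The $P_\mu$, although time-independent by definition, play in this respect
the rôle of the momentum-energy vector, the fourth equation being,
with $x^4 = t$, $P_4 = -H$,

$$i\,\frac{\partial A}{\partial t} = [A, H],$$

where $A$ is any of the quantities $A_\mu(x)$, $B^\mu(x)$.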


So far, however, we are very far from a physical description of the 
field, since the field operators of different space-time points are com- 
pletely independent. This in itself does by no means contradict the 
basic ideas of quantum theory, where the causal relationship between 
events is replaced by a probability relationship. But we need restrictions 
for the permitted state vectors $\Psi$ by means of which such relationships
may be formulated. Such restrictions will also lead to a restriction of
the $P_\mu$-operators corresponding to the energy-momentum principle.


Electromagnetic fields without sources 


In order to get an idea of the way in which the considerations of the 
former section may be applied to definite physical problems, we shall 
here outline a provisional treatment on these lines of the quantization 
of the electromagnetic field but with the ordinary limitation corresponding
to special relativity, (12) being now restricted to Lorentz
transformations. This, however, is to be regarded as a deviation from 
our programme where the electromagnetic field should be introduced 
in connection with an adequate generalization of the gauge transforma- 
tion group. We shall see that the electromagnetic field quantities may 
be derived from two vectors $A_\mu(x)$ and $B^\mu(x)$ fulfilling the commutation
relations (9) of the former section. 


Let us first expand these vectors according to a complete set of
orthogonal and normalized eigenfunctions $U_k(x)$, $U^*_k(x)$ of the four
space-time coordinates, which we shall later on assume to be free wave
functions of a four-dimensional periodicity parallelepipedon. Thus
we have

$$\int d^4x\, U_k(x)\, U^*_{k'}(x) = \delta_{kk'}. \tag{20}$$

We put now

$$A_\mu(x) = \sum_k A_\mu(k)\, U_k(x), \qquad B^\mu(x) = \sum_k B^\mu(k)\, U^*_k(x), \tag{21}$$

from which follows

$$A_\mu(k) = \int d^4x\, A_\mu(x)\, U^*_k(x), \qquad B^\mu(k) = \int d^4x\, B^\mu(x)\, U_k(x), \tag{22}$$

which by means of (9) and (20) gives rise to the following commutation
relations

$$[A_\mu(k), A_\nu(k')] = [B^\mu(k), B^\nu(k')] = 0, \qquad [A_\mu(k), B^\nu(k')] = i\,\delta^\nu_\mu\, \delta_{kk'}. \tag{23}$$

Let us introduce the expansions (21) into the expressions (18), (14)
for the translation operators, assuming hereby $U_k(x)$, $U^*_k(x)$ to be of the
free wave form, so that

$$\frac{\partial U_k(x)}{\partial x^\mu} = i k_\mu\, U_k(x), \tag{24}$$

$k$ being now a shorthand notation for the four real c-number quantities
$k_\mu$. Hereby the completeness of the set of $U_k(x)$-functions requires
that these take all positive and negative values consistent with the
periodicity condition, the $k$-vectors being space-like as well as time-like.
We obtain thus

$$P_\mu = \frac{1}{2i} \sum_k k_\mu \big(A_\nu(k) B^\nu(k) + B^\nu(k) A_\nu(k)\big). \tag{25}$$


We shall now assume that in a given coordinate system the space 
components of the A and B vectors are hermitian operators, while the 
time components are antihermitian. According to (10) $Q^\nu_\mu(x)$ will then
be antihermitian, when both indices are space-like or both time-like,
but hermitian, when one index is space-like and the other time-like.
This means that with a choice of the coefficients $\varepsilon^\nu_\mu$ corresponding to a
space rotation the operator of (12) is unitary, while for a special
Lorentz transformation, corresponding to a pure translatory motion
of the new coordinate system, $Q = Q^+$. Now, this is a well-known
property of the Lorentz transformation. In order to put (25) into a 


convenient form and in accordance with what has just been said we put

$$A_\mu(k) = (\omega(k)\, g^{\mu\mu})^{-1/2}\, \frac{C_\mu(k) + C^+_\mu(-k)}{\sqrt{2}}, \qquad B^\mu(k) = (\omega(k)\, g^{\mu\mu})^{1/2}\, i\, \frac{C^+_\mu(k) - C_\mu(-k)}{\sqrt{2}}. \tag{26}$$

Here

$$g^{11} = g^{22} = g^{33} = -g^{44} = 1, \tag{27}$$

while $\omega(k)$ is a positive, even c-number function of $k$, which may be
arbitrarily chosen. We shall later make a definite choice, which
connects our operators with those of the ordinary theory. With $\omega(k)$
fixed the $C_\mu$ and $C^+_\mu$ are uniquely determined by means of the $A_\mu$, $B^\mu$.
In fact, we have

$$C_\mu(k) = \frac{1}{\sqrt{2}} \Big( (\omega(k) g^{\mu\mu})^{1/2} A_\mu(k) + i\, (\omega(k) g^{\mu\mu})^{-1/2} B^\mu(-k) \Big),$$
$$C^+_\mu(k) = \frac{1}{\sqrt{2}} \Big( (\omega(k) g^{\mu\mu})^{1/2} A_\mu(-k) - i\, (\omega(k) g^{\mu\mu})^{-1/2} B^\mu(k) \Big). \tag{28}$$

With

$$A_\mu(-k) = g^{\mu\mu} A^+_\mu(k), \qquad B^\mu(-k) = g^{\mu\mu} B^{\mu +}(k), \tag{29}$$

which is seen to correspond to the above assumptions about $A_\mu(x)$,
$B^\mu(x)$, we see that $C^+_\mu(k)$ is the conjugate imaginary of $C_\mu(k)$. Moreover,
from (23) we get the following commutation relations:

$$[C_\mu(k), C_\nu(k')] = [C^+_\mu(k), C^+_\nu(k')] = 0, \qquad [C_\mu(k), C^+_\nu(k')] = \delta_{\mu\nu}\, \delta_{kk'}, \tag{30}$$

which show that the $C_\mu(k)$ are absorption operators, the $C^+_\mu(k)$ emission
operators.

Introducing the expressions (26) into (25) we obtain

$$P_\mu = \sum_k k_\mu\, C^+_\nu(k)\, C_\nu(k), \tag{31}$$

which are formally the same type of expressions as those giving the
momentum-energy components of a system of free photons in the usual
theory. But here the number of $C_\mu$, $C^+_\mu$ operators is of a higher order
of infinity, the four $k_\mu$ being quite unrelated. According to our general
programme we shall now restrict the permitted state vectors $\Psi$ of the
system not only by means of a relation corresponding to the Lorentz 
condition but also by relations corresponding to the Maxwell equations 
of the ordinary theory, which cannot here be written as operator 


equations, because such are not consistent with the commutation 
relations (9). For the purpose in question we introduce the following 
set of operators: 


whereby A"(k) = ane = CAX) 1 CrH(x), 
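Then the Lorentz condition will take the following form:
\[ \frac{\partial C^\mu(x)}{\partial x^\mu}\,\Psi = 0, \tag{33} \]
and the Maxwell conditions will be
\[ \Box\, C_\mu(x)\,\Psi = 0, \tag{34} \]
where $\Box$ is the d'Alembert operator. From (33) and (34) we obtain
easily, using (22) and (26), the following expectation value relations:
\[ \left\langle \frac{\partial A^\mu(x)}{\partial x^\mu} \right\rangle = 0, \qquad
\left\langle \frac{\partial B^\mu(x)}{\partial x^\mu} \right\rangle = 0, \tag{35} \]
and, moreover,
\[ \langle \Box A_\mu(x) \rangle = 0, \qquad \langle \Box B^\mu(x) \rangle = 0. \tag{36} \]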


In order to see the exact meaning of these conditions we shall return 
to the Fourier expansions, which then give 


\[ \sum_\mu g^\mu k_\mu\, C_\mu(k)\,\Psi = 0, \tag{37} \]
\[ k^2\, C_\mu(k)\,\Psi = 0, \tag{38} \]
where
\[ k^2 = k_1^2 + k_2^2 + k_3^2 - k_4^2 = g^\mu k_\mu^2. \tag{39} \]

Now, in the case $k^2 \neq 0$, by means of a Lorentz transformation (37)
may always be brought back to the case $k_1 = k_2 = k_3 = 0$ for a time-like
vector and to the case $k_2 = k_3 = k_4 = 0$ for a space-like vector,
where
\[ C_\mu(k)\,\Psi = 0 \]
($\mu = 4$ and 1 respectively), in which case $\Psi$ corresponds to the vacuum
state with respect to the operator in question, the corresponding term
in the permitted expectation values of $P_\nu$ being absent. For $k^2 = 0$ we
have the ordinary case treated by GUPTA and BLEULER. Likewise (38)
means that the terms in the expectation values of $P_\nu$ corresponding to
$k^2 \neq 0$ will be missing. Strictly speaking $k^2$ will only take the value 0,
when the period of the time coordinate goes towards infinity. We can,



however, always define a "mass shell" [5] within which $|k^2| < \epsilon$, $\epsilon$
vanishing as the period gets infinite.

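The remaining $C_\mu$ are associated with a set of $U$-functions, which
may be chosen as orthogonal and normalized with respect to the
three-dimensional space, i.e.
\[ U_{\mathbf{k}}(\mathbf{r}, t) = \frac{1}{\sqrt{V}}\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{40} \]
where $V$ is the space periodicity volume, $\mathbf{k}$ the space vector $(k_1, k_2, k_3)$, $\mathbf{r}$
the vector $(x^1, x^2, x^3)$, and where
\[ \omega(\mathbf{k}) = |\mathbf{k}|, \tag{41} \]
which implies a further restriction of the sign of $k_4 = -\omega(\mathbf{k})$. The
notation $\omega(k)$ of the arbitrary factor entering in (26) was chosen in
order to identify it at this stage with the $\omega(\mathbf{k})$ of (41), so that finally we
get for the restricted A and B vectors (denoting them by the same letter
as the unrestricted quantities)
\[ A_\mu(\mathbf{r}, t) = \sum_{\mathbf{k}} \frac{1}{\sqrt{2 V \omega(\mathbf{k})\, g^\mu}}
\left( C_\mu(\mathbf{k})\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} + C_\mu^\dagger(\mathbf{k})\, e^{-i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \right), \tag{42} \]
\[ B^\mu(\mathbf{r}, t) = g^\mu\, \frac{\partial A_\mu(\mathbf{r}, t)}{\partial t}. \tag{43} \]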

With (42) and (43) the relations (36) are automatically fulfilled. 
Moreover, with the restricted quantities, the first of the equations (35) 
corresponds to the Lorentz condition, while the second equation (35) 
takes the form 


\[ \langle \operatorname{div} \mathcal{E} \rangle = 0, \tag{44} \]

the electric field vector $\mathcal{E}$ being defined in the usual way. As to (43), it
must be emphasized that the identification of the quantities on the right
and left side of this equation can only be made for the restricted operators
and only in a given Lorentz frame. In the usual theory, however, only
the right hand quantity appears and plays the rôle of canonically
conjugate of $A_\mu(\mathbf{r}, t)$. It should further be pointed out that the restriction
(41), although Lorentz invariant, is not very satisfactory from a formal 
point of view. But in this connection it should again be stressed that 
according to our programme the electromagnetic field ought to be 
introduced by means of a more essential generalization of the Einstein 
theory of gravitation and not in the way used provisionally in this section 
in order to compare the present procedure with the usual one. 


Finally it should be pointed out that the calculation of real expectation
values also in this representation has to be made according to the
procedure of GUPTA and BLEULER. Thus we have to introduce an
operator $\eta$ with the properties

\[ \eta = \eta^\dagger, \qquad \eta^2 = 1, \qquad \eta A_\mu(x) = g^\mu A_\mu(x)\,\eta, \qquad \eta B^\mu(x) = g^\mu B^\mu(x)\,\eta. \tag{45} \]
As is easily seen, these properties are fulfilled by 


\[ \eta = e^{\,i\pi \sum_k C_4^\dagger(k)\, C_4(k)}. \tag{46} \]

Let now $R$ be the operator for a finite Lorentz transformation. Then
it follows from (10) that
\[ R^\dagger \eta = \eta R^{-1}. \tag{47} \]
From
\[ A'_\mu(x) = R^{-1} A_\mu(x) R \]
it follows therefore that
\[ \eta A'_\mu(x) = R^\dagger\, \eta A_\mu(x)\, R. \tag{48} \]

And since all four components of $\eta A_\mu(x)$ are hermitian according to
(45) we see that this property is retained in the Lorentz transformation.

As is done by GUPTA and BLEULER, this is best expressed by means
of a new metric for the calculation of expectation values. Introducing
thus an adjoint state vector $\bar\Psi$ given by
\[ \bar\Psi = \Psi^* \eta, \tag{49} \]
the expectation value of an operator $A$ is
\[ \langle A \rangle = (\bar\Psi, A\Psi) = (\Psi^*, \eta A \Psi). \tag{50} \]

In a Lorentz transformation, instead of transforming the operators,
we shall then have to transform the state vectors according to the
relations
\[ \Psi' = R\Psi, \qquad \Psi'^{*} = \Psi^* R^\dagger, \qquad \bar\Psi' = \bar\Psi R^{-1}. \tag{51} \]


Although the quantities $C_\mu$, $C_\mu^\dagger$ used in defining the restrictions are
introduced in a particular Lorentz frame, it thus follows that the 
theory is, in fact, Lorentz invariant. 
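
The algebraic properties required of the metric operator can be illustrated on a single time-like oscillator. The sketch below is an assumption-laden toy and not the field-theoretic construction itself: one mode only, a Fock space truncated at `N` levels, and $\eta = e^{i\pi C^\dagger C}$ as the one-mode analogue of the Gupta-Bleuler operator. It checks that $\eta$ is hermitian, squares to one, anticommutes with the antihermitian time component, and that $\eta A_4$ is hermitian.

```python
import numpy as np

N = 8                                    # Fock-space truncation (assumption)
n = np.arange(N)
C = np.diag(np.sqrt(n[1:]), k=1).astype(complex)   # absorption operator
Cd = C.conj().T                                     # emission operator
number = Cd @ C                                     # diag(0, 1, ..., N-1)

# eta = exp(i*pi*C^dagger C); the exponent is diagonal, so exponentiate entrywise.
eta = np.diag(np.exp(1j * np.pi * np.diag(number)))

omega, g4 = 1.0, -1.0
s = np.sqrt(complex(omega * g4))         # (omega g^4)^{1/2} = i sqrt(omega)
A4 = (C + Cd) / (np.sqrt(2) * s)         # one-mode analogue of the time component

checks = {
    "eta hermitian": np.allclose(eta, eta.conj().T),
    "eta squared is 1": np.allclose(eta @ eta, np.eye(N)),
    "A4 antihermitian": np.allclose(A4.conj().T, -A4),
    "eta A4 = g4 A4 eta": np.allclose(eta @ A4, g4 * A4 @ eta),
    "eta A4 hermitian": np.allclose((eta @ A4).conj().T, eta @ A4),
}
print(checks)
```

Because $\eta$ is diagonal with entries $(-1)^n$ and the ladder operators change $n$ by one, the anticommutation holds exactly even in the truncated space.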


The S-transformation of Dirac wave functions 


In the Dirac theory that linear transformation of the wave-function 
components which corresponds to a Lorentz transformation plays a 
well-known rôle. Its generalization, i.e. a transformation of the
wave-function components in which the coefficients are arbitrary functions of
the space-time variables, is very fundamental in the generally relativistic 
form of the Dirac equation, which is not only to be invariant with respect 
to coordinate transformations but also with respect to the group of 


linear transformations just mentioned, a point which has been 
particularly stressed by BARGMANN*. 


We shall now treat these transformations in a similar way as the 
general coordinate transformations were treated in section 1. Thus to
every space-time point $x$ we associate a number of operators $\varphi_\alpha(x)$,
$\psi_\alpha(x)$, where the index $\alpha$ takes as many values as the number of columns
or rows in the Dirac matrices. For the general theory we need not fix
this number. And we assume the following commutation relations for
these operators:

\[ \{\varphi_\alpha(x), \psi_\beta(x')\} = \delta_{\alpha\beta}\,\delta^4(x - x'), \qquad
\{\varphi_\alpha(x), \varphi_\beta(x')\} = \{\psi_\alpha(x), \psi_\beta(x')\} = 0. \tag{52} \]


Let now $A$ and $B$ be two arbitrary matrices of the same number of
rows and columns as the number of components $\psi_\alpha$, the matrix elements
being c-numbers, either constant or functions of $x$, then the following
formula is easily proved by means of (52):

\[ \left[ \int d^4x\, \varphi(x) A \psi(x),\ \int d^4x'\, \varphi(x') B \psi(x') \right] = \int d^4x\, \varphi(x) [A, B] \psi(x). \tag{53} \]

Here by $\varphi A \psi$ we mean as usual $\sum_{\alpha,\beta} \varphi_\alpha (\alpha|A|\beta) \psi_\beta$. From (53) we obtain
easily the following formula, which we shall now have to use:

\[ e^{-S_A} \int d^4x\, \varphi(x) B \psi(x)\; e^{S_A} = \int d^4x\, \varphi(x)\, e^{-A} B e^{A}\, \psi(x), \tag{54} \]
where
\[ S_A = \int d^4x\, \varphi(x) A \psi(x). \tag{55} \]
The formula (54) may be regarded as a consequence of the two relations
\[ e^{-S_A}\, \psi(x)\, e^{S_A} = e^{A} \psi(x), \tag{56a} \]
\[ e^{-S_A}\, \varphi(x)\, e^{S_A} = \varphi(x)\, e^{-A}, \tag{56b} \]

which are likewise simple consequences of (52). Formula (56a)
expresses a general linear transformation of the components of $\psi(x)$,
the coefficients being c-number functions of x, and (56b) is the inverse 
transformation. 
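
Relations (53), (56a) and (56b) can be tried out on a finite toy model. The sketch below is only an illustration under stated assumptions: the field is replaced by three fermion modes built with a Jordan-Wigner construction, $\varphi_\alpha$ is chosen as the hermitian conjugate of $\psi_\alpha$, the integral over $x$ is dropped, and the matrices `A`, `B` are random. All function names are hypothetical helpers.

```python
import numpy as np

def fermion_modes(n):
    """Jordan-Wigner matrices for n annihilation operators on a 2**n-dim space."""
    I2 = np.eye(2)
    Z = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])   # maps occupied to empty
    ops = []
    for j in range(n):
        op = np.array([[1.0]])
        for factor in [Z] * j + [lower] + [I2] * (n - j - 1):
            op = np.kron(op, factor)
        ops.append(op.astype(complex))
    return ops

def bilinear(A, psi):
    """S_A = sum_{a,b} phi_a A_ab psi_b with phi_a = psi_a^dagger (toy choice)."""
    dim = psi[0].shape[0]
    S = np.zeros((dim, dim), dtype=complex)
    for a in range(len(psi)):
        for b in range(len(psi)):
            S += A[a, b] * psi[a].conj().T @ psi[b]
    return S

def expm_taylor(M, terms=80):
    """Matrix exponential by a plain Taylor series (adequate for small matrices)."""
    result = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

rng = np.random.default_rng(1)
n = 3
psi = fermion_modes(n)
phi = [p.conj().T for p in psi]
A = 0.3 * (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
B = 0.3 * (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
S_A, S_B = bilinear(A, psi), bilinear(B, psi)

# (53): the commutator of two bilinears is the bilinear of the matrix commutator.
ok_53 = np.allclose(S_A @ S_B - S_B @ S_A, bilinear(A @ B - B @ A, psi))

# (56a) and (56b): conjugation with exp(S_A) acts linearly on psi and phi.
U, U_inv = expm_taylor(S_A), expm_taylor(-S_A)
eA, e_minus_A = expm_taylor(A), expm_taylor(-A)
ok_56a = all(
    np.allclose(U_inv @ psi[g] @ U, sum(eA[g, b] * psi[b] for b in range(n)))
    for g in range(n)
)
ok_56b = all(
    np.allclose(U_inv @ phi[g] @ U, sum(phi[a] * e_minus_A[a, g] for a in range(n)))
    for g in range(n)
)
print(ok_53, ok_56a, ok_56b)
```

The choice $\varphi = \psi^\dagger$ is made only to have concrete matrices; the identities themselves use nothing beyond the anticommutation relations (52).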

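We shall now fix our attention on the linear transformation of the
$\psi$-components corresponding to a Lorentz transformation, denoting
the $A$ in question by $L$. Then we can write

\[ L = \tfrac{1}{4}\, \epsilon_{\mu\nu}\, \gamma^\mu \gamma^\nu, \qquad \epsilon_{\mu\nu} = -\epsilon_{\nu\mu}, \tag{57} \]
$\gamma^\mu$ being a set of Dirac matrices satisfying the relations
\[ \{\gamma^\mu, \gamma^\nu\} = 2 g^\mu \delta_{\mu\nu}, \tag{58} \]

$\gamma^1$, $\gamma^2$, $\gamma^3$ being hermitian and $\gamma^4$ antihermitian.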


* V. BARGMANN, Berl. Ber. 1932, p. 346. From a quantal point of view also this 
transformation has been investigated by ROSENFELD (l.c.). 


Here we meet again with the well-known non-unitary character of 
the Lorentz transformation, which is at the root of the Gupta-Bleuler 
procedure. In this connection it means that, although we can choose $\varphi$
as the conjugate imaginary $\psi^*$ of $\psi$ in a given frame of reference, this
property is not conserved in a Lorentz transformation. Here we shall 
briefly indicate how by a similar procedure to that of Gupta-Bleuler 
we may maintain both the commutation relations and the reality 
properties of operators during a Lorentz transformation. 
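
The non-unitary character of the spinor transformation for a pure boost can be made concrete with explicit matrices. The sketch below is illustrative only: it assumes one particular $4\times 4$ representation with $\gamma^1, \gamma^2, \gamma^3$ hermitian and $\gamma^4$ antihermitian, takes the generator in the form $L = \tfrac14 \epsilon_{\mu\nu}\gamma^\mu\gamma^\nu$ with a single non-vanishing pair of coefficients, and uses an arbitrary parameter value `eps`. It checks the anticommutation relations, that a space rotation gives a unitary $e^L$, that a boost gives a hermitian but non-unitary $e^L$, and that $e^{L^\dagger} = \gamma^4 e^{-L} (\gamma^4)^{-1}$ in both cases.

```python
import numpy as np

def expm_taylor(M, terms=60):
    """Matrix exponential by a plain Taylor series (adequate for these small matrices)."""
    result = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z2, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)

# One possible representation (an assumption): gamma[0..2] hermitian, gamma[3] antihermitian.
gamma = [np.block([[Z2, 1j * s], [-1j * s, Z2]]) for s in (s1, s2, s3)]
gamma.append(1j * np.block([[I2, Z2], [Z2, -I2]]))
g = [1, 1, 1, -1]
I4 = np.eye(4)

clifford_ok = all(
    np.allclose(gamma[m] @ gamma[n] + gamma[n] @ gamma[m],
                (2 * g[m] if m == n else 0) * I4)
    for m in range(4) for n in range(4)
)

def generator(mu, nu, eps):
    """L for a single antisymmetric pair eps_{mu nu} = -eps_{nu mu} = eps."""
    return 0.25 * eps * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])

eps = 0.8
L_rot, L_boost = generator(0, 1, eps), generator(0, 3, eps)
S_rot, S_boost = expm_taylor(L_rot), expm_taylor(L_boost)

rot_unitary = np.allclose(S_rot @ S_rot.conj().T, I4)
boost_unitary = np.allclose(S_boost @ S_boost.conj().T, I4)
boost_hermitian = np.allclose(S_boost, S_boost.conj().T)

g4_inv = np.linalg.inv(gamma[3])
conj_relation = all(
    np.allclose(expm_taylor(L.conj().T), gamma[3] @ expm_taylor(-L) @ g4_inv)
    for L in (L_rot, L_boost)
)
print(clifford_ok, rot_unitary, boost_unitary, boost_hermitian, conj_relation)
```

The last check is the matrix statement that underlies the use of a modified metric: conjugation with $\gamma^4$ turns the hermitian conjugate of the transformation into its inverse.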


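Since according to (57) we have

\[ e^{L^\dagger} = e^{\frac{\pi}{2}\gamma^4}\, e^{-L}\, e^{-\frac{\pi}{2}\gamma^4}, \tag{59} \]
it follows as above that
\[ e^{S_L^\dagger} = \Lambda\, e^{-S_L}\, \Lambda^{-1}, \tag{60} \]
where $\Lambda$ is the unitary operator
\[ \Lambda = e^{\frac{\pi}{2} \int \varphi(x)\, \gamma^4\, \psi(x)\, d^4x}, \tag{61} \]

where $\varphi(x)$ ought to be chosen equal to $\psi^\dagger(x)$. Then for any quantity
$A$ formed by means of the $\varphi(x)$, $\psi(x)$ operators we have

\[ A' = e^{-S_L} A\, e^{S_L} = \Lambda^{-1} e^{S_L^\dagger} \Lambda A\, e^{S_L}, \]
or
\[ \Lambda A' = e^{S_L^\dagger}\, \Lambda A\, e^{S_L}. \tag{62} \]

If now, similarly to the Gupta-Bleuler procedure, the expectation value
$\langle A \rangle$ is calculated according to the formula

\[ \langle A \rangle = (\bar\Psi, A\Psi), \tag{63} \]
where
\[ \bar\Psi = \Psi^* \Lambda, \tag{64} \]

then according to (62) the reality properties of the expectation value
are not altered by the transformation. Instead of transforming $A$ it is
convenient here also to transform $\Psi$, $\Psi^*$ and $\bar\Psi$ according to the
formulae

\[ \Psi' = e^{S_L}\Psi, \qquad \Psi'^{*} = \Psi^* e^{S_L^\dagger}, \qquad \bar\Psi' = \bar\Psi\, e^{-S_L}, \tag{65} \]

so that we have
\[ \langle A \rangle' = (\bar\Psi', A\Psi'). \tag{66} \]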


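Let us now consider the translation operators, which again we denote
by $iP_\nu$. They are given by the following expressions:

\[ P_\nu = -\frac{i}{2} \int d^4x \left( \varphi(x)\, \frac{\partial \psi(x)}{\partial x^\nu} - \frac{\partial \psi(x)}{\partial x^\nu}\, \varphi(x) \right). \tag{67} \]

In fact, we have by means of (52)

\[ (1 - i\,\delta x^\nu P_\nu)\, \psi(x)\, (1 + i\,\delta x^\nu P_\nu) = \psi(x) + \delta x^\nu\, \frac{\partial \psi(x)}{\partial x^\nu}, \]
\[ (1 - i\,\delta x^\nu P_\nu)\, \varphi(x)\, (1 + i\,\delta x^\nu P_\nu) = \varphi(x) + \delta x^\nu\, \frac{\partial \varphi(x)}{\partial x^\nu}. \tag{68} \]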
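From (52) it follows further that with
\[ F = \frac{1}{2} \int d^4x\, f(x) \left( \varphi(x)\psi(x) - \psi(x)\varphi(x) \right), \tag{69} \]
one has
\[ e^{-iF}\, \psi(x)\, e^{iF} = \psi(x)\, e^{if(x)}, \qquad
e^{-iF}\, \varphi(x)\, e^{iF} = \varphi(x)\, e^{-if(x)}, \tag{70} \]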


where f(x) is again an arbitrary c-number function of the space-time 
coordinates. This gives expression to the gauge transformation of the 
electron wave-functions. As is easily seen, the operator 


e—Ndaf(z) 5- or ee 


plays a similar role for the electromagnetic field, so that with 


OB"(x) 


ax” 


G = Jatxf(x) (— + Foy) — yd}, 7) 
Zs) 


e’* may be regarded as the gauge transformation operator of the 
combined electromagnetic and electron systems. In fact, 


of(x) 


e 4A (xe? = A(x) — ae 


which is the gauge transformation of the potential vector. As we shall 
see the expectation value of 


\[ q = -\frac{e}{2} \int d^4x \left( \varphi(x)\psi(x) - \psi(x)\varphi(x) \right) \tag{72} \]


in the restricted states is that of the total electric charge of the electron 
system. It should be pointed out that the above treatment of the gauge 
transformation is still provisional, in that it does not take account of 
the existence of neutral particles like the neutrino. 


Let now $u^{(r)}(k)e^{ikx}$, $u^{(r)*}(k)e^{-ikx}$ be a complete set of four row
eigenfunctions for a four-dimensional periodicity parallelepipedon,
whereby for every set of $k_\nu$ there are four $u^{(r)}(k)$, two of which satisfy
the equation
\[ \left( \gamma^\nu \frac{\partial}{\partial x^\nu} + m \right) u = 0, \tag{73a} \]
two the equation
\[ \left( \gamma^\nu \frac{\partial}{\partial x^\nu} - m \right) u = 0, \tag{73b} \]
$m$ being one root of the equation
\[ m^2 + g^\mu k_\mu^2 = 0, \tag{73c} \]

the $k_\nu$ taking all positive and negative values consistent with the
periodicity condition. We develop the operators $\psi(x)$ and $\varphi(x)$
according to these functions, putting

\[ \psi(x) = \sum_{r,k} a_r(k)\, u^{(r)}(k)\, e^{ikx}, \qquad
\varphi(x) = \sum_{r,k} a_r^\dagger(k)\, u^{(r)*}(k)\, e^{-ikx}, \tag{74} \]

assuming thus that in the frame of reference chosen $\varphi(x) = \psi^*(x)$.

From (52) and the completeness of the set of $u^{(r)}(k)$ functions it follows
now that $a_r(k)$, $a_r^\dagger(k)$ will satisfy the commutation relations

\[ \{a_r(k), a_{r'}(k')\} = \{a_r^\dagger(k), a_{r'}^\dagger(k')\} = 0, \qquad
\{a_r(k), a_{r'}^\dagger(k')\} = \delta_{rr'}\,\delta_{kk'}. \tag{75} \]
Introducing the developments (74) into (67) we get
\[ P_\nu = \frac{1}{2} \sum_{k,r} k_\nu \left( a_r^\dagger(k)\, a_r(k) - a_r(k)\, a_r^\dagger(k) \right), \tag{76} \]
and further for the quantity $q$ of (72)
\[ q = -\frac{e}{2} \sum_{k,r} \left( a_r^\dagger(k)\, a_r(k) - a_r(k)\, a_r^\dagger(k) \right). \tag{77} \]

We divide now the k-states in such belonging to positive energies
(negative $k_4$-values) and such belonging to negative energies*. Putting
for a positive energy state

\[ N_r(k) = a_r^\dagger(k)\, a_r(k) \tag{78a} \]
and for a negative energy state
\[ M_r(k) = a_r(-k)\, a_r^\dagger(-k), \tag{78b} \]

* As has kindly been pointed out to me by Professor W. PAULI, this division is rela- 
tivistically invariant only for time-like k-vectors (corresponding to real m-values) and not 
for space-like k-vectors (corresponding to imaginary m-values). As we shall see below, 
this would not seem to invalidate the relativistic invariance of the theory. 


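($a_r(k)$ being thus annihilation operators for negatons and $a_r^\dagger(-k)$ such
operators for positons) we get

\[ P_\nu = \sum_{E>0} \left( \left( N_r(k) - \tfrac{1}{2} \right) + \left( M_r(k) - \tfrac{1}{2} \right) \right) k_\nu, \qquad
q = -e \sum_{E>0} \left( N_r(k) - M_r(k) \right), \tag{79} \]

where $E = -k_4$. These formulae differ from the ordinary formulae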
of electron theory only by the inclusion of an infinity of states corres- 
ponding to wrong mass values. Just as in the former section we used 
relations equivalent to the Maxwell equations as conditions for the 
state vector in order to connect the treatment with the usual procedure 
we shall here use the Dirac equation in a similar way, fixing the mass 
value appropriately. Since we are neglecting the interaction between 
electrons and electromagnetic fields the question of the self-mass does 
not appear. 


First, thus, we divide the k-states into those belonging to time-like
k-vectors (real m-values, class I) and those belonging to space-like
k-vectors (imaginary m-values, class II), a division which is relativistically
invariant. Then we divide class I into states of positive energy
(class Ia) and states of negative energy (class Ib) and class II into states
of positive $m/i$ (class IIa) and states of negative $m/i$ (class IIb), all four
subclasses being relativistically invariant. To make this division clearer
we shall consider a slightly more general case, namely an electron
system subject to an external electromagnetic field, which is not
pair-creating. Let then $u(x)$ be a solution of the Dirac equation

\[ \left( \gamma^\nu \left( \frac{\partial}{\partial x^\nu} + ieA^0_\nu(x) \right) + m \right) u = 0, \tag{80} \]

where $A^0_\nu(x)$ are the potentials of the field in question, and where $m$
is an eigenvalue parameter, to be suited to the four-dimensional
periodicity condition. Then if $K$ denotes the unitary, symmetric
matrix occurring in the Kramers theory of charge conjugation, with the
property

\[ K \gamma^{\nu *} K^{-1} = \gamma^\nu, \tag{81} \]

we have
\[ \left( \gamma^\nu \left( \frac{\partial}{\partial x^\nu} - ieA^0_\nu(x) \right) + m^* \right) K u^*(x) = 0. \tag{82} \]
Now, class Ia will be described by means of solutions of (80) with
positive energy, but with both positive and negative masses, while
class Ib will be described by means of solutions of (82) belonging to
u-states of negative energy and both signs of $m$. Similarly class IIa
will be described by means of solutions of (80) with both signs of the
energy and positive sign of $m/i$, while class IIb consists of the solutions of
(82) belonging to u-states with both signs of the energy and negative
sign of $m/i$. Since both a-classes satisfy (80) and both b-classes (82) we
may call a the negaton class and b the positon class. Denoting now the
a-class part of $\psi(x)$ by $\psi_a(x)$ and the b-class part of $K\varphi(x)$ by $\chi_b(x)$, both
$\psi_a$ and $\chi_b$ will only contain absorption operators and we may write the
conditions for the state vector $\Psi$ in the following form

\[ D_a \psi_a \Psi = 0, \qquad D_b \chi_b \Psi = 0, \tag{83} \]
with
\[ D_a = \gamma^\nu \left( \frac{\partial}{\partial x^\nu} + ieA^0_\nu(x) \right) + m_0, \qquad
D_b = \gamma^\nu \left( \frac{\partial}{\partial x^\nu} - ieA^0_\nu(x) \right) + m_0, \tag{84} \]

$m_0$ being the electron mass. Let now $u^{(n)}(x,m)$, $v^{(n)}(x,m)$ be a complete
system of eigenfunctions of (80) in the four-dimensional periodicity
volume satisfying the orthogonality and normalization conditions

\[ \int d^4x\, u^{(n)*}(x,m)\, u^{(n')}(x,m') = \delta_{nn'}\,\delta_{mm'}, \]
\[ \int d^4x\, v^{(n)*}(x,m)\, v^{(n')}(x,m') = \delta_{nn'}\,\delta_{mm'}, \tag{85} \]
\[ \int d^4x\, u^{(n)*}(x,m)\, v^{(n')}(x,m') = 0, \]
the $u$ belonging to class a, the $v$ to class b. Then we may put

\[ \psi_a(x) = \sum_{n,m} a_n(m)\, u^{(n)}(x,m), \qquad
\chi_b(x) = \sum_{n,m} b_n(m)\, K v^{(n)*}(x,m), \tag{86} \]

where $b_n$ as well as $a_n$ are absorption operators. It follows that (83) is
equivalent to
\[ a_n(m)\Psi = 0, \qquad b_n(m)\Psi = 0, \qquad m \neq m_0. \tag{87} \]


Applied to the case of free particles this leads immediately to the 
dropping out of all the "wrong" $N_r(k)$, $M_r(k)$ from the expectation
values of the expressions (79), which will thus agree with the expectation 
values for the energy-momentum vector and the total electric charge 
of ordinary electron theory. As is easily seen, similar results will hold 
in the presence of an external, static, electromagnetic field. Let us, however,
consider the more general case, where $A^0_\nu(x)$ are time-dependent.


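For this purpose we need a time-dependent operator which in the
static case goes over to the operator $-P_4$ and which in a stricter sense
than this one corresponds to the Hamiltonian of the ordinary theory.
We obtain this operator if for a moment we regard time as a discrete
variable corresponding to a periodicity condition for $k_4$, the period of
which shall then approach infinity. With this assumption the commutation
relation for the operators corresponding to $\varphi(x)$ and $\psi(x)$
(we call them $\bar\varphi(\mathbf{r},t)$, $\bar\psi(\mathbf{r},t)$) takes the form

\[ \{\bar\varphi(\mathbf{r},t), \bar\psi(\mathbf{r}',t')\} = \delta^3(\mathbf{r} - \mathbf{r}')\,\delta_{tt'}. \tag{88} \]

Let us now consider the following operator

\[ P_4(t)\,\delta t = -\frac{i}{2} \int d^3x \Big( \bar\varphi(\mathbf{r},t) \big( \bar\psi(\mathbf{r},t+\delta t) - \bar\psi(\mathbf{r},t) \big)
- \big( \bar\psi(\mathbf{r},t+\delta t) - \bar\psi(\mathbf{r},t) \big)\, \bar\varphi(\mathbf{r},t) \Big), \]

where $t$ and $t + \delta t$ are two possible, adjacent values of the time variable.
By means of (88) we get then

\[ [\bar\psi(\mathbf{r},t),\, iP_4(t)\,\delta t] = \bar\psi(\mathbf{r},t+\delta t) - \bar\psi(\mathbf{r},t) = \delta\bar\psi(\mathbf{r},t), \]
or going to the limit $\delta t \to 0$,

\[ [\bar\psi(\mathbf{r},t),\, iP_4(t)] = \frac{\partial \bar\psi(\mathbf{r},t)}{\partial t}, \tag{89} \]
with
\[ P_4(t) = -\frac{i}{2} \int d^3x \left( \bar\varphi(\mathbf{r},t)\, \frac{\partial \bar\psi(\mathbf{r},t)}{\partial t} - \frac{\partial \bar\psi(\mathbf{r},t)}{\partial t}\, \bar\varphi(\mathbf{r},t) \right), \tag{90} \]
where, for given $t$, $\bar\varphi(\mathbf{r},t)$ and $\bar\psi(\mathbf{r},t)$ satisfy the commutation relation

\[ \{\bar\varphi(\mathbf{r},t), \bar\psi(\mathbf{r}',t)\} = \delta^3(\mathbf{r} - \mathbf{r}'). \tag{91} \]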


Hitherto, in this investigation, we have used a kind of Heisenberg
representation with operators for every value of $t$ and time-independent
state vectors. In order to approach the ordinary theory we shall now
introduce time-dependent state vectors so defined that for any two of
them, say $\Phi(t)$ and $\Psi(t)$,
\[ (\Phi(t),\, \alpha(t+\delta t)\,\Psi(t)) = (\Phi(t+\delta t),\, \alpha(t)\,\Psi(t+\delta t)), \tag{92} \]

$\alpha(t)$ being any time-dependent operator. Since, according to (89),
$1 + iP_4(t)\,\delta t$ performs the transition from $t$ to $t + \delta t$, this means that
\[ i\, \frac{\partial \Psi(t)}{\partial t} = -P_4(t)\, \Psi(t), \tag{93} \]
which is the Schrödinger equation, $-P_4(t)$ playing the role of the
Hamiltonian. We may now, say at $t = t_0$, take a $\Psi$ satisfying the conditions
(87), i.e. corresponding to the vacuum state for all the "wrong"


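mass-values. In order to maintain these conditions $P_4(t)$ must not
contain any of the wrong operators. In other words $\bar\varphi(x)$ and $\bar\psi(x)$
must only contain permitted state operators. This we obtain by taking
for $\bar\psi$ and $\bar\varphi$ the expressions

\[ \bar\psi(x) = \sum_n \left( a_n\, \bar u^{(n)}(\mathbf{r},t) + b_n^\dagger\, \bar v^{(n)}(\mathbf{r},t) \right), \qquad
\bar\varphi(x) = \sum_n \left( a_n^\dagger\, \bar u^{(n)*}(\mathbf{r},t) + b_n\, \bar v^{(n)*}(\mathbf{r},t) \right), \tag{94} \]

where for shortness we have put
\[ a_n = a_n(m_0), \qquad b_n = b_n(m_0). \tag{95} \]

Here $\bar u^{(n)}(\mathbf{r},t)$, $\bar v^{(n)}(\mathbf{r},t)$ are three-dimensionally orthogonal and
normalized eigenfunctions of the equations (80) and (82) respectively,
whereby, as is easily seen,

\[ \bar u^{(n)}(\mathbf{r},t) = \sqrt{T}\, u^{(n)}(x,m_0), \qquad \bar v^{(n)}(\mathbf{r},t) = \sqrt{T}\, v^{(n)}(x,m_0), \]

$T$ being the time period of the four dimensional periodicity volume, to
which the functions $u$ and $v$ belong. In fact, this follows from the
continuity relations belonging to the Dirac equations, according to
which integrals of the type

\[ \int d^3x\, u^{(n)*}(x,m_0)\, u^{(n')}(x,m_0) \tag{96} \]

are time-independent.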


Introducing the developments (94) into the expression (90) for
$-P_4(t)$ we get now, assuming as above that there are no negaton-positon
interaction terms, and using the Dirac equation satisfied by the
functions $u$ and $v$,

\[ -P_4(t) = \sum_{n,n'} \left\{ \tfrac{1}{2} \left( a_n^\dagger a_{n'} - a_{n'} a_n^\dagger \right) (n|\mathcal{H}_a|n')
+ \tfrac{1}{2} \left( b_n^\dagger b_{n'} - b_{n'} b_n^\dagger \right) (n|\mathcal{H}_b|n') \right\}, \tag{97} \]

with $\mathcal{H}_a = i\gamma^4 D_a + i\,\partial/\partial t$, $\mathcal{H}_b = i\gamma^4 D_b + i\,\partial/\partial t$ being one-particle
Hamiltonians for negatons and positons respectively. As is seen, this is in
conformity with ordinary electron theory and (97) thus contains the
expression (79) for $-P_4$ as a special case. This equality is a consequence
of the formal conservation theorem for the energy of a Dirac equation 
with time-independent potentials. 


As to the interaction between the electron system and the quantized 
electromagnetic field we shall limit ourselves to a few words. Here we 
shall have to modify the Maxwell conditions so that they resemble the 


corresponding operator equations of the ordinary theory as closely as 
possible, and also the Lorentz condition has to be modified in a way 
approaching that proposed by BLEULER*. Moreover these conditions 
must be so chosen that they are compatible not only with one another 
but also with the Dirac conditions. Now, these latter conditions (if
they are taken in the most obvious way, i.e. by replacing the potentials
$A^0_\nu$ in (84) by the quantized potentials $A_\nu$) contain also emission
operators to "wrong" photon states, which probably means that in
the generalized theory such states will occur as intermediate states. 
In the present unfinished state of the theory it is still uncertain, whether 
it will be possible or not to find such a formulation of the interaction 
problem on the lines here proposed, where the presence of such 
“unphysical” intermediate states will not upset the agreement 
exhibited by present electron theory with refined experiments. Still 
there may be some reason to expect that the most obvious formulation 
of the interaction problem on these lines should already give rise to a 
normalizable theory, the effect of the "wrong" states being limited to
further infinite contributions to the mass and charge of the electron. 


Finally it should be pointed out that the fact that the $A_\nu(x)$ potentials
to be introduced in the Dirac conditions commute with one another 
for arbitrary space-time points, may present a certain advantage. Thus 
we can formally use the same procedure as with external fields, i.e. 
find the transformation operator which leads from the eigenfunctions 
of the field-free Dirac equation to the solutions in presence of the 
$A_\nu(x)$-field. The transformation operator will then appear as a function
of the $A_\nu(k)$ of (22), and a state vector $\Psi$ of the system may be expressed
by means of the transformation operator in question and a state vector 
of the system without interaction. 


Concluding remarks 


The comparatively simple way in which quantum electrodynamics can 
be represented when use is made from the start of the point of view of 
general relativistic invariance would seem to justify some confidence 
in the soundness of the programme outlined above, although what is 
given is only a rough sketch, the more important problems being as yet 
unsolved. Apart from the interaction problem of quantum electro- 
dynamics, which has only just been touched upon in the above pages, 
there is in the first place the generally relativistic Dirac equation and 
its possible generalization, which remains to be treated. 


In concluding a paper on quantum theory and relativity for this 
BOHR celebration volume the famous discussions between BOHR and


* K. BLEULER, l.c. p. 578. 


EINSTEIN come to mind, which were most charmingly described by the 
former in the EINSTEIN 70th birthday celebration volume. The writer 
remembers very vividly BOHR's disappointment when returning to
Copenhagen from the Solvay congress in 1927, where he had presented
his general philosophy of complementarity, because EINSTEIN, the
founder of the theory of light quanta, was absolutely unwilling to accept
his ideas. We know from BOHR's account how ingeniously EINSTEIN
defended his standpoint (the essential incompleteness of the quantal
description of nature) and how BOHR refuted every one of his arguments
with more than ingenuity. What impressed us younger people
most was, I think, the "Einstein box," where BOHR successfully turned
general relativity theory against EINSTEIN. This was just one example
of BOHR's deep insight in the physical ideas of relativity theory, which
appears in so many remarks in his works. And still EINSTEIN, who 
accepted all defeats with the utmost fairness but without changing his 
basic view, may have felt that on the side of the quantum physicists the 
importance of the general relativity claim in the search for the laws of 
the microworld was usually underestimated. 


In paying homage to BOHR by the above programmatic sketch I
should like to express the hope that the development of such con- 
siderations may contribute to an intimate alliance of the two funda- 
mental viewpoints of present physics, that of complementarity and that 
of relativity. 

My best thanks are due to Professor W. PAULI for much kind and 
stimulating criticism. 


REFERENCES 


[1] W. PAULI and V. WEISSKOPF; Helv. Phys. Acta 7, 709, 1934
[2] W. HEISENBERG and W. PAULI; Z. Phys. 56, 1, 1929; 59, 169, 1930
[3] E. FERMI; Rev. Mod. Phys. 4, 131, 1932
[4] S. N. GUPTA; Proc. Phys. Soc. 63, 681, 1950
    K. BLEULER; Helv. Phys. Acta 23, 567, 1950
[5] See, e.g. W. HEITLER; The Quantum Theory of Radiation, Oxford 1954, p. 153f 


ON THE THEORY
OF SUPERCONDUCTIVITY


H. B. G. Casimir 


ON the twelfth of April, 1911, BOHR's thesis on the theory of electrons
in metals [1] was accepted by the Faculty of Natural Philosophy at the
University of Copenhagen. Almost simultaneously, on April 28th [2]
and in more detail on May 27th [3], KAMERLINGH ONNES reported to
the Royal Academy at Amsterdam a surprising discovery: he and his
co-workers had found that the electrical resistance of mercury drops to
an immeasurably small value below 4.2 °K.


It seems that KAMERLINGH ONNES did not at once realize the revolu- 
tionary character of this observation. He had previously given a formula
according to which the resistance of all pure metals should
continuously drop toward zero with decreasing temperature and was 
inclined to regard the results for mercury as a confirmation of this— 
very sketchy—theory. But at the Solvay Conference in November of 
that year and in a communication to the Royal Academy at Amsterdam 
on November 25th [4] the abruptness of the transition is emphasized 
and the idea of “‘superconductivity” as a particular condition of matter 
is from then on established. 


In BOHR's thesis the kinetic theory of the motion of electrons in
metals is treated in greater detail and on a more general basis than had 
been done by Lorentz and others before; at the same time it is pointed 
out that classical mechanics is an insufficient basis for explaining the 
structure of matter. In the modern theory of metals the ideas of quan- 
tum mechanics concerning quantized lattice vibrations, motion of 
electron waves in periodic lattices and Fermi-Dirac statistics are 
incorporated into the framework developed by Lorentz and BOHR.
In this way a theory is obtained which gives a reasonably satisfactory— 
though in most cases far from quantitative—explanation of the electrical 
and thermal properties of metals. But even today superconductivity 
is not really understood, although a wealth of experimental material 
has been added and many phenomenological aspects have been 
clarified and although recent theoretical work of FRÖHLICH [5] and
BARDEEN [6] has at least revealed an important clue. 



Nothing then would suit the present occasion better than a contribu- 
tion toward such an understanding. Unfortunately, I have been 
unable to make any relevant progress. Nor do I intend to give a more or 
less complete synopsis of experimental material and theoretical ideas: 
there exist several excellent reviews of the subject. I shall confine 
myself to a discussion of certain aspects of superconductivity that, 
though well known in principle, do not always receive sufficient 
attention. 


The normal state 


There exist many metals that do not become superconductive in the 
temperature range in which they have been investigated; copper, 
silver, the alkalis are well-known examples. We say that these metals 
remain in the normal state. Whether this holds true down to the 
absolute zero we do not know for certain, although it is rather plausible. 
At temperatures above the transition point $T_c$, and at temperatures
below $T_c$ in magnetic fields above the critical field $H_c(T)$, superconductors
behave in all respects like these normal metals. This is one of the most 
important properties of superconductors. 


The theory describing the normal state when the absolute tempera- 
ture tends towards zero is extremely simple. If the behaviour of the 
electrons can be described by a Fermi distribution of single electron 
wave functions, then at the absolute zero all levels up to a certain 
energy $\epsilon_0$ will be occupied. If $n(\epsilon_0)$ is the density of one-electron states
per energy-interval in the neighbourhood of this maximum energy, a
simple calculation shows that the electronic specific heat is given by
\[ \frac{\pi^2}{3}\, k^2 T\, n(\epsilon_0). \]
To this must be added a term $AT^3$ for the contribution of the lattice
vibrations. Since at temperatures well below the so-called Debye
temperature $\Theta$ only lattice waves with a wave-length of many atomic
distances are excited, the constant $A$ does not depend on any approximate
theory for the spectrum of lattice vibrations. (If, however, we want
to calculate the specific heat at higher temperatures or the total zero
point energy of the lattice, we have to have recourse to such approximations.)
The total specific heat is thus given by

\[ C_v = BT + AT^3. \tag{1} \]
This formula is really rather well confirmed for all normal metals that
have been investigated in the liquid-helium range, but also for superconductors
in the normal state. It would be difficult to fit any other
simple formula to the existing data and on the other hand we do not
at present know any other model that would lead to a similar formula
for the specific heat. It is often convenient to write $n(\epsilon_0)$ as $n/kT_0$, where
$n$ is the total number of free electrons per cm³ and $T_0$ is a "degeneracy
temperature." This degeneracy temperature is found to be of the order
of $10^4$ °K. As far as order of magnitude is concerned this agrees with
the results of a calculation where the electrons are treated as free, viz.

\[ n(\epsilon_0) = \frac{4\pi m}{h^2} \left( \frac{3n}{\pi} \right)^{1/3}. \]


For some metals the agreement is even almost quantitative, in other 
cases there are considerable deviations as is to be expected since in 
many metals the electron bands may well have a complicated structure. 
It is useful to note that at a temperature $T$ the total number of electrons
excited above $\epsilon_0$ is given by

\[ n_{exc} = n\, \frac{T}{T_0}\, \log 2. \]

This is of course equal to the number of holes below $\epsilon_0$.
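
The factor $\log 2$ comes from integrating the Fermi function above the edge of the distribution: with a constant density of states $n(\epsilon_0) = n/kT_0$ near $\epsilon_0$, the number of excited electrons is $n(\epsilon_0)\,kT \int_0^\infty dx/(e^x + 1)$. The sketch below checks that integral numerically with a simple trapezoidal rule. It is an illustration under stated assumptions: the density of states is taken as constant, the chemical potential is identified with $\epsilon_0$, and the temperatures passed in (4 °K and $10^4$ °K) are merely example values. The function name is hypothetical.

```python
import numpy as np

def excited_fraction(T, T0, x_max=60.0, steps=600_000):
    """Return (n_exc / n, integral) with x = (energy - eps_0) / kT."""
    x = np.linspace(0.0, x_max, steps + 1)
    fermi = 1.0 / (np.exp(x) + 1.0)          # occupation above the Fermi edge
    dx = x[1] - x[0]
    # Trapezoidal rule; the tail beyond x_max is negligible.
    integral = dx * (fermi.sum() - 0.5 * (fermi[0] + fermi[-1]))
    return (T / T0) * integral, integral

fraction, integral = excited_fraction(T=4.0, T0=1.0e4)
print(integral, np.log(2.0))   # the two numbers should agree closely
print(fraction)                # fraction of electrons excited at the example temperature
```

For temperatures in the liquid-helium range the excited fraction is therefore only a few parts in ten thousand, which is why only the states close to the edge matter for the low-temperature properties.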


Electrical resistance is due to two mechanisms: scattering by 
lattice vibrations and scattering by lattice imperfections, which may be 
either chemical or physical in nature. The resistance due to lattice 
vibration decreases rapidly with decreasing temperature (BLOCH's
theory predicts a law of the form $aT^5$); the other mechanism leads to
a constant residual resistance.


Call $\tau$ the time of relaxation corresponding to impurity scattering,
which may still depend on the direction of motion, then by a very simple
argument

\[ i_x = e^2\, \tau\, v_x^2\; n(\epsilon_0)\, E_x. \]


As a matter of fact the resistance of most metals drops to a constant 
value, which depends on the purity of the specimen. Unfortunately 
sometimes the resistance passes through a minimum. This behaviour 
seems to be characteristic of certain solid solutions for instance of 
manganese in gold. The only way to reconcile this with the simple 
theory would be to suppose that $\tau$ is temperature dependent. Impurity
levels very close to the edge of the Fermi distribution might lead to such 
a temperature dependence but unless theoretical reasons can be given 
for the existence of these levels this is a rather artificial assumption [7]. 
It is equally possible that the phenomenon indicates the inadequacy of 
the simple theory; it might even be due to the type of interaction that 
also leads to superconductivity [8]. In this connection very accurate 
measurements of the specific heat of alloys showing a minimum in the 
resistance curve would be desirable. 


The situation with respect to heat conduction is analogous. The 
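resistance due to lattice vibrations should decrease according to a $T^2$
law. If only lattice imperfections are taken into account we have for
the heat current

\[ w_x = -\tau\, v_x^2\, C_{el}\, \frac{\partial T}{\partial x} = -\tau\, v_x^2\, \frac{\pi^2}{3}\, k^2 T\, n(\epsilon_0)\, \frac{\partial T}{\partial x}. \]
If the residual resistance is called $R_0$, the limiting law for the heat
resistance $\rho$ is
\[ \rho = \frac{R_0}{LT}, \tag{2} \]

where the Wiedemann-Franz constant $L$ is given by
\[ L = \frac{\pi^2}{3} \left( \frac{k}{e} \right)^2. \tag{3} \]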

The experimental confirmation leaves something to be desired, partly
because the influence of lattice vibrations on the thermal
resistance drops less rapidly than the corresponding term in the electric
resistance. The quantitative theory for the joint influence of lattice
vibrations and imperfections is complicated and the agreement with
experiment is far from perfect. There is no reason to doubt the validity
of (2), and some measurements even give a value for $L$ in quantitative
agreement with (3).
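
The limiting law (2) can be traced back to the two transport expressions for impurity scattering. The following is only a sketch, assuming that the same average $\overline{\tau v_x^2}$ enters both the electric and the heat current; the symbols $\sigma$ (electric conductivity) and $\kappa$ (heat conductivity) are introduced here purely for illustration.

```latex
\begin{align*}
\sigma &= e^2\,\overline{\tau v_x^2}\,n(\epsilon_0), \qquad R_0 = \frac{1}{\sigma},\\
\kappa &= \frac{\pi^2}{3}\,k^2 T\,\overline{\tau v_x^2}\,n(\epsilon_0),\\
\frac{\kappa}{\sigma T} &= \frac{\pi^2}{3}\left(\frac{k}{e}\right)^2 = L,\\
\rho &= \frac{1}{\kappa} = \frac{1}{L\,T\,\sigma} = \frac{R_0}{LT}.
\end{align*}
```

The relaxation time and the density of states drop out of the ratio, which is why $L$ contains only universal constants; in SI units it is approximately $2.44 \times 10^{-8}\ \mathrm{V^2/K^2}$.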

Summarizing we may say that there is fairly good experimental 
evidence for the validity of the simple theory, but a still better 
investigation into the validity of these limiting laws seems desirable. 


Let us now return to the assumptions underlying this theory. The 
equation for the specific heat is valid as soon as the states in the neigh- 
bourhood of the lowest state of the metal as a whole can be labelled as 
if they were states of independent electrons. The equations for electric 
and thermal conductivity suppose somewhat more, namely that these 
states are really described by independent electron wave functions. 
But it is only necessary to make this assumption for the few electrons 
that are near the edge of the distribution. For the electrons farther 
inside the Fermi distribution it may well be that interactions play a 
prominent role. We must now discuss a rather tricky point. In the 
theory of an ideal gas interaction between atoms is neglected, but it is 
possible to calculate successive approximations, the so called virial 
coefficients, and even though it may be difficult to prove rigorously the 
convergence of such a development a small value of the first correction 
term gives an indication that deviations from the ideal behaviour will
be slight. In the theory of metals this is quite different. The Coulomb
interaction between electrons has only been taken into account in so 
far as it can be accounted for in terms of a self-consistent field. If now 
interaction is introduced by normal perturbation processes, in second 
approximation a divergent result is found. Qualitative arguments have 
been brought forward to explain why the simple picture applies to 
reality. Others have tried to start from a different angle and to study 
excited states of a plasma. Perhaps it is not entirely irrelevant to 
remark that the average distance between excited electrons is more 
than 10 or 20 atomic distances and that one would expect that in a 
metal at this distance electric fields are screened off. However this may 
be, we can summarize this section in the following way: there exists 
an extremely simple theory of the normal state when $T \to 0$. Experimental
results are on the whole in agreement with the qualitative
conclusions of this theory although more material is needed. It is far 
from clear why the simple theory should be such a good approximation. 


The phase-transition 


Not only is the behaviour of a metal in the normal state extremely 
simple, but also the phase transition between the normal and the 
superconducting state can be adequately described by a simple 
formalism, at least as long as we restrict ourselves to so-called “soft”
superconductors like Hg, Sn, etc., and consider only the longitudinal
case, thus avoiding the complications due to deformation of the
external field by the superconducting sample. If the specimen is not too
small, $B = 0$ is a sufficient description of its magnetic behaviour.
Between the free energy in the normal state $F_n$, the free energy in the
superconducting state in zero field $F_s$ and the threshold field $H_c$, there
exists the relation (all thermal quantities are given in erg/cm$^3$)

$$F_n = F_s + \frac{H_c^2}{8\pi}.$$

The function $F_n$ follows immediately from Eq. (1). Taking the energy
in the normal state at the absolute zero as the zero point, we have

$$F_n = -\tfrac{1}{12}AT^4 - \tfrac{1}{2}BT^2.$$
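
The normal-state free energy is obtained by integrating the specific heat twice. The lines below are a sketch under the assumption that Eq. (1), which is not restated in this section, has the form $C_n = AT^3 + BT$ (lattice term plus linear electronic term); $S_n$ denotes the entropy and is a symbol introduced here.

```latex
\begin{align*}
C_n &= AT^3 + BT,\\
S_n(T) &= \int_0^T \frac{C_n(T')}{T'}\,dT' = \tfrac{1}{3}AT^3 + BT,\\
F_n(T) &= -\int_0^T S_n(T')\,dT' = -\tfrac{1}{12}AT^4 - \tfrac{1}{2}BT^2,
\end{align*}
```

with the energy of the normal state at the absolute zero taken as zero.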


KOK [9] was the first to point out a set of relations that hold fairly
accurately for a number of metals and which are in any case quite
sufficient for discussing orders of magnitude. They are based on the
observation that (a) the specific heat in the superconductive state is of
the form $AT^3 + A_1T^3$; since the elastic constants in the superconducting
state are not appreciably different from those in the normal state, the
term with $A_1$ must be of electronic origin; (b) that the threshold curve
(giving the threshold field $H_c$ as a function of temperature) is parabolic
in form. Let $T_c$ be the critical temperature and $\Delta U$ the difference in
energy between the superconducting and the normal state at the
absolute zero, then it follows from the assumptions a and b that

$$\Delta U = \tfrac{1}{4}BT_c^2$$

and

$$A_1 = \frac{3B}{T_c^2}.$$

The critical field is given by

$$H_c = (2\pi B)^{1/2}\,T_c\left[1 - \left(\frac{T}{T_c}\right)^2\right].$$


GORTER and CASIMIR [10] have shown that this behaviour of the free
energy $F_s$ can be interpreted on the basis of a crude two-fluid model.


Let x be a parameter indicating in some way the degree of normality 
of the superconductive state; we may call x the number of normal 
electrons, although this expression has perhaps no rigorous meaning. 
Let us now write 


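$$F(x,T) = -\tfrac{1}{2}x^{1/2}BT^2 - (1 - x)\Delta U.$$

Putting

$$\frac{\partial F}{\partial x} = 0$$

gives

$$x^{1/2} = \frac{BT^2}{4\,\Delta U}.$$

Above the transition point $x$ remains unity, below the transition point
we have

$$x = \left(\frac{T}{T_c}\right)^4.$$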
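
The relations quoted from KOK can be recovered from the two-fluid free energy by direct substitution. The following worked steps are a sketch: they treat $F(x,T)$ as the electronic part only, on the assumption that the lattice term $-\tfrac{1}{12}AT^4$ is the same in both phases and cancels in the difference; the superscript "el" is a label introduced here.

```latex
\begin{align*}
\frac{\partial F}{\partial x} &= -\tfrac{1}{4}x^{-1/2}BT^2 + \Delta U = 0
  \;\Longrightarrow\; x^{1/2} = \frac{BT^2}{4\,\Delta U},\\
x(T_c) = 1 &\;\Longrightarrow\; \Delta U = \tfrac{1}{4}BT_c^2,
  \qquad x = \left(\frac{T}{T_c}\right)^4 \quad (T \le T_c),\\
F_s^{\mathrm{el}}(T) &= F\bigl(x(T),T\bigr)
  = -\tfrac{1}{4}BT_c^2 - \tfrac{1}{4}B\,\frac{T^4}{T_c^2},\\
C_s^{\mathrm{el}} &= -T\,\frac{\partial^2 F_s^{\mathrm{el}}}{\partial T^2}
  = \frac{3B}{T_c^2}\,T^3 \;\Longrightarrow\; A_1 = \frac{3B}{T_c^2},\\
\frac{H_c^2}{8\pi} &= F_n - F_s
  = -\tfrac{1}{2}BT^2 + \tfrac{1}{4}BT_c^2 + \tfrac{1}{4}B\,\frac{T^4}{T_c^2}
  = \tfrac{1}{4}BT_c^2\left[1 - \left(\frac{T}{T_c}\right)^2\right]^2 .
\end{align*}
```

Since $\partial^2 F/\partial x^2 = \tfrac{1}{8}x^{-3/2}BT^2 > 0$, the stationary point is a minimum, and the last line reproduces the parabolic threshold curve.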


Although the thermodynamic behaviour of superconductors has been 
known for a long time, it is perhaps worthwhile to emphasize some of 
its consequences. 


First of all the normal state and the superconductive state are both 
possible states of matter. This implies that in the phase-space for 
the metal as a whole the states that are described by Fermi wave-functions
really exist, although it is possible that a few states corresponding
to individual electrons at a distance from $\epsilon_0$ small compared
with $kT_c$ are missing: this would have no influence on the specific
heat above $T_c$. In a sufficiently large magnetic field only Fermi states
should be occupied. These states are of course modified by the field 
but since the susceptibility in the normal state is low the influence of 
this modification is negligible. But in any case a theory of supercon- 
ductivity should show that besides the states correctly described by the 
Fermi model there exist other states. These states should not come 
instead of the Fermi states, for then we would no longer be able to 
understand why a superconductor can behave like a normal metal. In 
the second place the thermodynamic formalism gives little hope for 
arriving at an understanding of superconductivity by an application of 
perturbation theory to the Fermi distribution. It is of course always a 
doubtful procedure to try to calculate a new phase by studying the 
influence of perturbations on another phase. For instance if one starts 
by studying the deviations from the ideal gas law one can hardly expect 
to get much information about the crystal structure of the same 
element in a condensed state. We may even say more in general that one 
should always expect a phase transition to be connected with a singu- 
larity in the free energy and hence with a breaking down of a power 
series development. Yet in some cases like ferromagnetism or order- 
disorder transitions perturbation theory does give a certain indication of 
what is going to happen. The fair amount of success of the Van der 
Waals’ equation for describing the condensation of gases is another 
example. But in these cases the interactions have already an appreciable 
influence above the transition point. In such cases it is understandable 
that the behaviour of one or two terms in a series development of the 
free energy will tell something about the occurrence of a new phase. If, 
however, the phase above the transition point behaves in an ideal way, 
it is evident that the transition can only be studied on the basis of 
extremely high order perturbations, in other words a perturbation 
approach becomes completely useless. Further it seems evident that 
such a high order calculation will involve many electrons: the 
superconductive state must be described by a many electron wave 
function and probably show some sort of coherence over extended 
areas. 


We have stressed in the first section that superconductivity does not 
in any way announce itself above the transition point and not even 
above the transition curve. In this last case one has to introduce a
surface energy between a superconductive and a normal phase in order
to prevent regions thinner than a London penetration depth
from persisting above the transition curve. Even if remnants of superconductivity
were found above the critical curve, this would not necessarily
invalidate our arguments. There was at one time a slight indication
that the thermo-electric power per degree started to drop a few tenths of
a degree above $T_c$, but these results have not been confirmed by other
experiments.


Let us again compare the situation with the condensation of an ideal
gas. (If we study this condensation at constant volume it even affords
a good example of a transition of the second kind.) At low temperatures
the vapour behaves as an ideal gas and perturbation theory will
be useless. On the other hand once one knows the appropriate model
for the solid state, it is possible to calculate the free energy of the solid 
and since the free energy of the ideal gas is accurately known, we can 
then determine the transition by applying thermodynamics. 


Let us next return to the two-fluid theory. Superconductivity seems 
to be an almost ideal example of a transition of the second kind and 
such transitions can only be understood if the system on one side of the 
transition point contains an internal parameter which by its very nature 
cannot exceed a certain value, for instance unity. Of course the special
form of $F(x,T)$ is only one out of an infinity which would lead to the
same result for $F_s$, but it is the simplest possibility. There is ample
evidence for the existence of such a parameter x: both the behaviour 
of the thermal conductivity and of the penetration depth reveal a gradual 
decrease of the number of Fermi electrons with a corresponding 
increase of the number of superconducting electrons. Only in the 
electric and—as far as bulk properties are concerned—the magnetic 
behaviour is this not evident, since everything is short-circuited by the 
infinite conductivity and B = 0 anyway. 


It would follow from the foregoing that one can only hope to arrive 
at a theory of superconductivity by making an intelligent guess about 
the superconducting state. If one tries to visualize such a state on the 
basis of a picture starting from independent electron states one is faced 
with a curious situation even if one disregards for a moment the
all-important electromagnetic behaviour. On the one hand near the
absolute zero all the electrons near the edge of the Fermi distribution
must be involved, for there remains no linear specific heat term. On
the other hand the total energy change occurring in the transition from
the normal to the superconducting state is small: if we assume that
the electrons in an energy band of the order $kT_c$ take part, the energy
per electron is $kT_c$. Averaged over all electrons it amounts to
$kT_c(T_c/T_0)$.

There is, however, one reassuring lesson to be drawn from these 
thermodynamic discussions. Suppose that wave functions for the
superconductive state with the right properties were found; then we do
not have to worry about the critical field: this becomes a purely 
thermodynamic question. 


The electromagnetic behaviour 


It is generally assumed that the electromagnetic behaviour of a superconductor
can be described by LONDON’s equations [11], that is by
adding to the Maxwell equations the equation

$$\operatorname{curl}\,\Lambda\mathbf{i} = -\frac{1}{c}\,\mathbf{H} \qquad (4)$$

where $\mathbf{i}$ is the current density and $\Lambda$ a new constant. It then follows by
a straightforward mathematical analysis that for a singly connected
body in an external magnetic field there exists only one solution.
Introducing the penetration depth $\lambda$

$$\lambda = c\left(\frac{\Lambda}{4\pi}\right)^{1/2},$$

we find that as long as the dimensions and radii of curvature are large
compared with $\lambda$, the magnetic induction inside the body decreases
according to an exponential function

$$B = B_0\,e^{-d/\lambda}$$

where $B_0$ is the value of the induction at the surface and $d$ the distance
to the surface, measured along the normal. If, however, we consider a 
ring, the solution is no longer uniquely defined but contains an arbitrary 
constant which is to a high degree of approximation the magnetic flux 
passing through the ring. It can thus be said that LONDON’s equation 
describes both the Meissner effect (B = 0) and the infinite conductivity. 
This conclusion, however, requires further discussion. Whereas in the 
case of a singly connected body the one and only solution represents 
the state of lowest free energy as long as the field does not surpass the 
critical value, the lowest state of a ring is that with zero current, and all 
other formal solutions of LONDON’s equations are thermodynamically 
unstable. Since equation (4) can only be expected to hold as an average 
statistical law and is certainly not true for all possible states of the metal 
(among which occur also normal Fermi states), it is not a priori evident 
that fluctuations will not destroy a persisting current. As a matter of 
fact there is one mechanism that will certainly lead to a dying-out of 
the current. There exists a finite, though small probability that by 
thermal fluctuations a certain length of the ring will be raised over its 
whole cross-section to a temperature above the transition point. If we 
try to calculate this effect it is found that the corresponding average 
resistance is extremely small unless we consider extremely thin wires 
immediately below the transition point. In practice this effect is therefore
of no importance, but the argument shows that LONDON’s equation
does not in itself establish the existence of persisting currents. After all,
the smallness of the fluctuation effect just considered is a consequence
of the value of BOLTZMANN’s constant, which does not figure in LONDON’s
equation. It seems further reasonably clear that fluctuations
affecting only part of the cross-section will have no influence, for there 
remains a superconducting loop to keep the enclosed flux constant. 
Yet, in our opinion no completely satisfactory analysis of this question 
has been given and perhaps this is impossible as long as the true nature 
of the superconductive state is unknown. 
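
The exponential decay of the induction near a plane surface follows from equation (4) combined with the static Maxwell equations. The steps below are a sketch for a plane boundary, assuming Gaussian units, no displacement current, and no distinction between $\mathbf{B}$ and $\mathbf{H}$ inside the metal; the depth coordinate $z$ (equal to the distance $d$ from the surface) is introduced here for illustration.

```latex
\begin{align*}
\operatorname{curl}\mathbf{H} &= \frac{4\pi}{c}\,\mathbf{i},
\qquad
\operatorname{curl}\,\Lambda\mathbf{i} = -\frac{1}{c}\,\mathbf{H},\\
\operatorname{curl}\operatorname{curl}\mathbf{H}
  &= \frac{4\pi}{c}\operatorname{curl}\mathbf{i}
   = -\frac{4\pi}{\Lambda c^2}\,\mathbf{H},\\
\operatorname{div}\mathbf{H} = 0 \;&\Longrightarrow\;
\nabla^2\mathbf{H} = \frac{1}{\lambda^2}\,\mathbf{H},
\qquad \lambda = c\left(\frac{\Lambda}{4\pi}\right)^{1/2},\\
\frac{d^2 B}{dz^2} = \frac{B}{\lambda^2} \;&\Longrightarrow\;
B(z) = B_0\,e^{-z/\lambda}.
\end{align*}
```

The growing solution $e^{+z/\lambda}$ is discarded because the field must stay finite deep inside a body that is thick compared with $\lambda$.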


LONDON has pointed out that equation (4) is obtained if for
an appropriately gauged vector potential the diamagnetic current
$(e^2/mc)\psi^*A\psi$ is not compensated by an influence of this same vector
potential on the wave function and hence on that part of the current
that depends on the derivatives of the wave functions. In LANDAU’s
theory of diamagnetism this compensation is almost perfect, leaving only
a small diamagnetism. In LONDON’s philosophy superconductivity
would require the existence of a wave function which is not changed
very much by the vector potential. It is well to bear in mind that if
LONDON’s explanation of persisting currents is valid, we have to consider
such stable wave functions extending over a mile or so of dirty
lead wire. This idea of macroscopic quantum states has also been
considered in connection with liquid helium. If LONDON’s equation on
the other hand would only hold in cells of the order of for instance
$10^{-4}$ cm, then we would still find $B = 0$ with sufficient accuracy, but
the existence of persisting currents would remain unexplained. So we
are faced with the following dilemma: either we have to abandon
LONDON’s idea altogether and must look for a separate explanation of
the infinite conductivity or we have to accept some sort of coherence 
extending over very large dimensions. Such behaviour is unusual in the 
theory of metals—crystals show mosaic structure, ferromagnetics tend 
to break up in Weiss domains—but it may well be a property charac- 
teristic of low temperature behaviour. It should also be remarked 
once more that it would not be sufficient if only one lowest state of the 
metal as a whole would show superconducting properties. After all 
the entropy in the superconducting state is different from zero and 
even at 1°K the expression exp (S/k) is an enormous number. In the 
two-fluid model this difficulty is obviated by distinguishing between 
normal electrons and superconducting ones, the latter group having zero 
entropy. A real theory would have to show the validity of this notion. 


If we accept not only LONDON’s equation, but also LONDON’s
suggested interpretation, the penetration depth is given by

$$\lambda = \left(\frac{mc^2}{4\pi n_s e^2}\right)^{1/2}$$

where $n_s$ is the number of superconducting electrons. If for $n_s$ we write
$(1 - x)n_0$, where $x$ is the parameter introduced in the phenomenological
theory and $n_0$ the number of superconducting electrons at the absolute
zero, the variation of penetration depth that is found experimentally
is fairly well accounted for. However, the magnitude of the penetration
depth is such that $n_0/m$ is much smaller than the ratio of the total
number of conduction electrons to the normal electronic mass. If
we assume that $m$ is the normal electronic mass, then for Sn only
0.3 electrons per atom are effective. If we suppose that all valence
electrons are effective, the effective mass must be 13.3 normal electron
masses. In any case from evidence concerning the penetration depth
one gathers the impression that the number of superconducting electrons
is of the order of the total number of conduction electrons and
not something of the order of $(T_c/T_0)n$.
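
Two consequences of this formula can be written out explicitly. The sketch below combines it with the two-fluid result $x = (T/T_c)^4$ from the section on the phase transition; it also assumes that tin contributes four valence electrons per atom, and the symbols $\lambda(0)$ and $m^*$ are introduced here for illustration.

```latex
\begin{align*}
n_s = (1 - x)\,n_0,\quad x = \left(\frac{T}{T_c}\right)^4
  \;&\Longrightarrow\;
  \lambda(T) = \lambda(0)\left[1 - \left(\frac{T}{T_c}\right)^4\right]^{-1/2},
  \qquad \lambda(0) = \left(\frac{mc^2}{4\pi n_0 e^2}\right)^{1/2},\\
\lambda \text{ fixes only } \frac{n_0}{m}:\qquad
  \frac{0.3}{m} = \frac{4}{m^*}
  \;&\Longrightarrow\; \frac{m^*}{m} = \frac{4}{0.3} \approx 13.3 .
\end{align*}
```

The second line shows that the two figures quoted for Sn are alternative readings of the same measured ratio $n_0/m$.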


Reference should be made here to PIPPARD’s ideas about “coherence.”
PIPPARD [12] has presented a number of arguments to show that superconductivity
is not a local property, but must be connected with some
sort of order extending over fairly large regions. He has also proposed 
a modification of LONDON’s equation in which the current density at a 
certain point is determined by an integral relation involving the values 
of the field in a certain neighbourhood rather than at the same point. 
There would appear to be no contradiction between this idea and the 
analysis of the phase transition given above. 


Theories of superconductivity 


Let us now consider some of the formerly proposed theories in the light 
of the preceding sections. 


Various attempts have been made to explain the properties of
the superconductive state by making assumptions about the density of
one-electron states near the top of the Fermi distribution, for instance
by postulating an energy gap [13]. Such a distribution, however, can
never account for the thermodynamic behaviour unless we suppose
that this gap is temperature dependent, but that means that there are
an infinite number of possible situations characterized by a gap width $\Delta$
and a lowest energy $U(\Delta)$. By making the free energy a minimum at
each temperature we find $\Delta(T)$. By an appropriate choice of $U(\Delta)$
agreement with experimental results may be obtained. Such a procedure
is only a visualization of a special internal-parameter theory. Also the
Gorter-Casimir formalism can easily be represented in a Fermi model:
we have only to suppose that besides the normal Fermi arrangement
there exist other arrangements characterized by a parameter $x$ such
that the lowest energy of the metal as a whole is $(1 - x)\Delta U$ below the
normal energy and the level spacing is larger by a factor $1/x^{1/2}$. Since,
however, these results are not obtained from first principles we cannot 
make any predictions about the magnetic behaviour of such states. 
One might be tempted to argue as follows: “In normal diamagnetism
the London current is compensated by a current due to second order
perturbations. If all the matrix elements remain the same and only
the distance between levels is increased by a factor $1/x^{1/2}$ then the
perturbation current is reduced by this same factor and there remains 
an uncompensated London current.” But such an argument is not even 
gauge invariant and to add to arbitrary postulates about energy 
distributions equally arbitrary assumptions about matrix elements is 
not a very serious proposition. Since the available evidence indicates 
that the phase transition is essentially a many electron phenomenon the 
value of such a modified independent electron formalism is doubtful. 


There is, however, one case in which a different level density is really 
obtained, viz. the case of a very small particle. The levels are more 
widely spaced; also the total energy goes up. It is easy to estimate the 
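average distance between energy levels. We have in a free-electron
model

$$n(\epsilon)\,d\epsilon = C\epsilon^{1/2}\,d\epsilon.$$

The total number of electrons is

$$n = \tfrac{2}{3}C\epsilon_0^{3/2},$$

hence

$$n(\epsilon_0) = \frac{3}{2}\,\frac{n}{\epsilon_0}$$

and the average spacing is

$$\delta\epsilon = \frac{2}{3}\,\frac{\epsilon_0}{n}.$$

Since $\epsilon_0 \sim kT_0$ it follows that $\delta\epsilon \sim kT_0/n$; deviations from the normal
specific heat are to be expected at temperatures $T \sim T_0/n$. Small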
particles should also show a large diamagnetism at low temperatures as 
was pointed out by HUND [14]. There is something rather tempting
in the idea that due to some sort of interaction the electron gas
“coagulates” into blocks; this interaction should be such that the energy
at absolute zero of this coagulated state is lower than that of the ideal 
Fermi gas. But there is no experimental evidence of such a cell structure, 
and a special explanation would be required for persisting currents. 
HEISENBERG [15] has tried to make the Coulomb interaction responsible
for superconductivity. He did not succeed in carrying through
the calculations in a rigorous way. What results is a modified Fermi
distribution and therefore essentially a thermodynamic theory. The
case against HEISENBERG’s theory is sometimes stated roughly as
follows. The electrostatic interaction energy is $e^2/a$ where $a$ is a distance
between electrons, therefore it is of the order $kT_0$ and cannot be responsible
for the very small energies involved in superconductivity. Therefore
we should neglect it altogether or in other words: the electrostatic
interaction is too large, so we put it equal to zero. As a matter of fact
the existence of a normal state is in itself a sufficient proof of the 
ineffectiveness of the electrostatic interaction but of course it does not 
necessarily follow that there could not remain just enough to cause a 
phase transition at low temperatures. 


FRÖHLICH [5] and BARDEEN [6] have pointed out that the interaction
with the zero point vibrations of the crystal lattice leads to an interaction
between electrons. Now it is found experimentally that if we compare
different isotopes the critical temperature is proportional to
$1/M^{1/2}$; therefore the energy difference at absolute zero is proportional
to $1/M$ or to the fourth power of the zero point amplitude.
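
The scaling argument in this paragraph can be spelled out in a few lines. The sketch below uses the relation $\Delta U = \tfrac{1}{4}BT_c^2$ from the section on the phase transition and assumes that $B$ does not depend on the isotopic mass $M$; it also assumes harmonic lattice vibrations of frequency $\omega$ and mean square zero point displacement $\langle u^2 \rangle$, symbols that are introduced here for illustration.

```latex
\begin{align*}
T_c \propto M^{-1/2} \;&\Longrightarrow\;
  \Delta U = \tfrac{1}{4}BT_c^2 \propto \frac{1}{M},\\
\langle u^2 \rangle = \frac{\hbar}{2M\omega},\quad \omega \propto M^{-1/2}
  \;&\Longrightarrow\; \langle u^2 \rangle \propto M^{-1/2},\\
\langle u^2 \rangle^{2} \;&\propto\; \frac{1}{M} \;\propto\; \Delta U .
\end{align*}
```

The energy difference therefore scales as the fourth power of the zero point amplitude, as stated above.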


However, as in HEISENBERG’s case, perturbation theory is inadequate 
and it is still too early to draw quantitative conclusions. But the 
existence of the isotope effect shows beyond doubt that superconduc- 
tivity is not a property of the electron gas as such but a property of the 
system (electrons + lattice vibrations). 


An interesting comment has been made by SCHAFROTH [16]: he 
shows that one will never obtain the characteristic diamagnetic 
behaviour on the basis of a convergent perturbation formalism starting 
from a normal Fermi distribution: one has to pass through a singularity. 
This tallies with our thermodynamic argument concerning the necessity 
of making an “a priori” guess of the wave functions in the superconducting
state.


The Bose-Einstein model 


In a recent paper SCHAFROTH [17] treats in detail the behaviour of an
ideal gas of charged bosons. An ideal Bose-Einstein gas has a condensation
temperature given by

$$T_B = \frac{h^2}{2\pi m_B k}\left(\frac{n}{2.612}\right)^{2/3}$$

where $m_B$ is the mass of the boson. Below this temperature the gas is
equivalent to a number of

$$n_{\mathrm{cond}} = n\left[1 - \left(\frac{T}{T_B}\right)^{3/2}\right]$$

particles in the ground state, and a normal Bose-Einstein gas containing
$n - n_{\mathrm{cond}}$ particles. The remarkable feature of Bose-Einstein condensation
is that a sizable fraction of the particles is forced by the statistics
into the very lowest eigenstate, a state which should have curious and 
essentially non-classical properties. The Bose-Einstein gas is the only 
model known at present that leads to macroscopic wave functions at 
finite temperatures and the super-fluidity of liquid helium has almost 
certainly something to do with this model. SCHAFROTH obtains the 
following results: 


(a) the condensed Bose gas obeys LONDON’s equation; the penetration
depth is given by

$$\lambda = \left(\frac{m_B c^2}{4\pi n_{\mathrm{cond}}\,e_B^2}\right)^{1/2};$$


(b) in a homogeneous field, however weak, there is no condensation. 
The threshold field can be determined by thermodynamics. While a 
gas of charged bosons has some rather striking analogies with a super- 
conductor there are also some qualitative differences. In the Bose- 
Einstein model the transition is not of the second but of the third kind 
—there is no discontinuity in the specific heat—and the free energy 
in the normal state is by no means independent of the magnetic 
field. 


SCHAFROTH has suggested that the electrons in superconductors 
might combine in pairs that would behave as bosons, but this idea 
meets with grave difficulties. From the formula for the penetration
depth and the experimental data for $\lambda$ it follows that if $m_B$ is of the order
of two electron masses the number of bosons at $T = 0$ is of the order
of the number of atoms. But from this then follows a condensation
temperature of the order of $T_0$. We would have to assume an effective
mass of the order of 10° electron masses to avoid this difficulty. Alter- 
natively we may suppose that the transition temperature of a super- 
conductor is the temperature at which bosons are formed, rather than 
the condensation temperature. But now we are faced with another 
difficulty. 


If it is going to have any sense at all to treat electron pairs as
bosons the distance inside such a pair has to be at least as small as
the distances between different pairs. This means a localization to
within a few atomic distances and such a localization would again be
connected with an energy of the order of magnitude comparable to
$kT_0$ (although somewhat smaller). So there does not seem to be any possibility
to reconcile the idea of electron pairs with the thermodynamics of
superconductors.


We might also consider groups of $p$ electrons. The penetration depth
is independent of $p$ whereas

$$T_B = \frac{1}{p^{5/3}}\,\frac{h^2}{2\pi m k}\left(\frac{n}{2.612}\right)^{2/3}.$$


In order to make this equal to the transition temperature p has to be a 
few hundred. This suggests that the bosons might be groups of Fermi 
electrons coagulated by interaction with lattice waves into blocks such 
as discussed in the preceding section. But again it is hard to see how 
this model could give the right behaviour above the transition point and 
a sufficiently small energy difference with the normal state. So I think
that this idea also has to be dropped.
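
The dependence on the group size $p$ follows from simple substitution in the formulas for the ideal charged Bose gas. The sketch below assumes that a group of $p$ electrons has mass $pm$ and charge $pe$ and that $n$ electrons per unit volume form $n/p$ groups; the ratio $10^4$ in the last line is an illustrative figure, not a value taken from the text.

```latex
\begin{align*}
m_B = pm,\quad e_B = pe,\quad n_B = \frac{n}{p}
  \;&\Longrightarrow\;
  \lambda^2 = \frac{m_B c^2}{4\pi n_B e_B^2}
            = \frac{pm\,c^2}{4\pi\,(n/p)\,p^2 e^2}
            = \frac{mc^2}{4\pi n e^2},\\
T_B(p) &= \frac{h^2}{2\pi\,pm\,k}\left(\frac{n}{2.612\,p}\right)^{2/3}
        = p^{-5/3}\,T_B(1),\\
T_B(p) = T_c \;&\Longrightarrow\;
  p = \left(\frac{T_B(1)}{T_c}\right)^{3/5},
  \qquad\text{e.g. } \frac{T_B(1)}{T_c} = 10^{4}
  \;\Rightarrow\; p = 10^{2.4} \approx 250 .
\end{align*}
```

The penetration depth is thus unaffected by the grouping, while the condensation temperature falls as $p^{-5/3}$, which is why a group size of a few hundred is needed.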

Yet it may well be that the notion of a Bose gas will help us to find the 
right approach to a solution of the problem. It is equally possible that 
Bose-Einstein condensation is only one example of a fairly general— 
though by no means universal—feature of quantum statistics. 


Concluding remarks 


We are still far from a theory of superconductivity. And yet there is 
reason to believe that the theory must essentially be simple. Super- 
conductivity is a fairly general phenomenon. It occurs for whole groups 
of metals and alloys. Whether a metal becomes superconducting or not 
may depend on the electron density, on $\Theta$, on the coupling between
lattice and electrons, but this dependence cannot be very critical. 
Superconductivity is certainly not a consequence of some freakish 
structure of electron bands. It is not a property of a few extravagant 
electrons: the whole surface of the Fermi distribution is involved. 
The phenomenology of the superconducting state is simple, much 
simpler than that of liquid helium. 


But for the time being superconductivity remains a challenge to the 
theoretical physicist. And as long as this phenomenon is not under- 
stood an essential element is lacking in our comprehension of statistical 
mechanics and of the nature of the solid state. 


REFERENCES 


[1] N. BOHR; Thesis, Copenhagen, 1911
[2] H. KAMERLINGH ONNES; Proc. Roy. Acad. Amsterdam 19, 1479, 1911
[3] ibid. 20, 81, 1911
[4] ibid. 20, 799, 1911
[5] H. FRÖHLICH; Phys. Rev. 79, 845, 1950
[6] J. BARDEEN; Phys. Rev. 79, 167, 1950; 80, 567, 1950; 81, 829, 1951; 82, 978, 1951
[7] J. KORRINGA and A. GERRITSEN; Physica 19, 457, 1953
[8] A. B. BHATIA; Phys. Rev. 95, 914, 1954
[9] J. A. KOK; Physica 1, 1103, 1934
[10] C. J. GORTER and H. B. G. CASIMIR; Phys. Z. 35, 963, 1934; Z. tech. Physik 15, 539, 1934
[11] cf. F. LONDON; Superfluids, Vol. 1, John Wiley, New York, 1950
[12] A. B. PIPPARD; Proc. Roy. Soc. A 216, 547, 1953
[13] J. C. SLATER; Phys. Rev. 51, 195, 1937; 52, 214, 1937
[14] F. HUND; Ann. Physik 32, 102, 1938. See also J. C. SLATER; Phys. Rev. 53, 208, 1938
[15] W. HEISENBERG; Z. Naturforsch. 2a, 185, 1947
[16] M. R. SCHAFROTH; Helv. Phys. Acta 24, 645, 1951
[17] M. R. SCHAFROTH; unpublished


THE COMPOUND NUCLEUS 
F. L. Friedman and V. F. Weisskopf 


1. RARELY has a single paper dominated our thinking as has BOHR’s
address [1] to the Copenhagen Academy in 1936. During the eighteen 
years since its appearance, it has been the decisive influence on the 
analysis of nuclear reactions. 


What was the situation in nuclear physics when it appeared? Only 
a few qualitative facts were then known about nuclear reactions. 
Early work had shown that most cross sections were of the order of 
nuclear dimensions. (For charged particles the nuclear effects were 
even less prominent because of the Coulomb field.) More recently, 
Fermi and his collaborators [2] had discovered much larger cross 
sections for slow neutron reactions in some elements. 


Previous theoretical attempts [3] to explain this variation in the size 
of cross sections were all based on an extremely simple picture of the 
nucleus, the potential well model. According to this model the effect 
of the target nucleus upon an incident particle can be described, at 
least as a first approximation, by an attractive potential. The quantal
state is given by

$$\psi = \phi(r)\,\chi(r_1 \cdots r_A) \qquad (1)$$

where $\chi$ is the wave function of the target nucleus and $\phi(r)$ is the wave
function of the incident particle. It is assumed that $\chi$ is uninfluenced
by the interaction, while $\phi(r)$ is supposed to be the solution of a one-particle
problem in which the particle moves in the potential $V(r)$. In
the simplest potential well model $V(r)$ is the square well:

$$V = -V_0, \quad r < R; \qquad V = 0, \quad r > R; \qquad \text{with } R = r_0 A^{1/3}. \qquad (2)$$

As usual, $r_0$ is a constant of the order of $10^{-13}$ cm and $A$ is the mass
number of the target nucleus. $V_0$ is of the order of nuclear energies, a
few tens of Mev.


The predictions of the potential well model are also simple. Only 
two types of reaction can be exhibited: elastic scattering and radiative 
phenomena. The scattering cross section dominates the scene. It is 


134 


The Compound Nucleus 135 


generally of nuclear dimensions, but it attains large values at
widely-spaced resonances. The spacing for a given angular momentum $l$ is
about 10 or 20 Mev; and at resonance the scattering cross section is
approximately $(2l + 1)\,4\pi\bar{\lambda}^2$ (with $\bar{\lambda}$ the reduced wavelength of the
incident particle). If a resonance occurs at low neutron
energies, the scattering cross section attains very large values.
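
The size of this resonance-peak estimate can be illustrated numerically. The sketch below is not part of the original argument: it assumes a non-relativistic neutron, uses rounded constants, ignores centre-of-mass and spin factors, and the function names are chosen here for illustration only.

```python
import math

HBAR_C_MEV_FM = 197.327      # hbar * c in MeV fm (rounded)
NEUTRON_MASS_MEV = 939.565   # neutron rest energy in MeV (rounded)
FM2_PER_BARN = 100.0         # 1 barn = 100 fm^2


def reduced_wavelength_fm(energy_mev):
    """Non-relativistic reduced wavelength hbar / sqrt(2 m E), in fm."""
    return HBAR_C_MEV_FM / math.sqrt(2.0 * NEUTRON_MASS_MEV * energy_mev)


def peak_scattering_barns(energy_mev, l=0):
    """Resonance-peak estimate (2l + 1) * 4 * pi * lambda_bar**2, in barns."""
    lam = reduced_wavelength_fm(energy_mev)
    return (2 * l + 1) * 4.0 * math.pi * lam ** 2 / FM2_PER_BARN


# Illustrative energies: thermal (0.025 eV), 1 eV, and 1 MeV.
for energy_mev in (0.025e-6, 1.0e-6, 1.0):
    print(f"E = {energy_mev:.3e} MeV -> {peak_scattering_barns(energy_mev):.3e} barns")
```

Under these assumptions the peak value falls from roughly a hundred million barns at thermal energy to a few barns at 1 Mev, which is why a resonance at low neutron energy looked like a natural explanation of the large observed cross sections.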


At first the large resonance cross sections seemed to be an explana- 
tion of the very high cross sections which FERMI and his collaborators 
found with thermal neutrons in some nuclei. However, there are other 
consequences of the potential well model which soon disqualified it: 
(1) The capture cross section is generally very small on this model and 
even at resonance the scattering dominates. In this model the neutron 
spends only a short time within the potential, and hence the probability 
of a radiative transition to a bound state within the potential is 
extremely small. (2) The wide spacing of the resonances reflects the 
most characteristic feature of both cross sections: They change slowly 
with energy. When $\bar{\lambda}^2$ is divided out, they change but little within
energy intervals of one Mev or less. This slow change with energy and 
the predominance of scattering even in resonance are typical of single 
particle behaviour. 


Nature disagrees with the potential well model on both counts. Just 
before BOHR’s celebrated address, experiments performed by BJERGE
and WESTCOTT, MOON and TILLMAN, SZILARD, FERMI and others [4] 
showed that neutron cross sections vary greatly within a few electron 
volts: Resonances are both extremely narrow and closely spaced. At 
the same time the cross sections at resonance turned out to be mainly 
capture. 


In 1936, then, the potential well model was due for replacement. 
The narrow close-spaced resonances and the high capture probability 
in resonance required some changes in the description of nuclear 
reactions. In fact, a complete change of view resulted, and it was this 
change that was the subject of BOHR’s address.


The main concept BOHR introduced is the compound nucleus: A
many-body state formed immediately after the impinging particle hits 
the nucleus, it is the antithesis of the single particle model it replaced. 
The argument runs as follows: The observation of closely-spaced 
resonances in heavy nuclei is an indication that the states formed in the 
reaction are states of a many-particle system. The form (1) cannot 
describe reactions where closely-spaced resonances occur. Level dis- 
tances of a few volts in systems of nuclear size can only occur if a large 
number of particles are involved in the excitation. It is wrong therefore 
to assume that the passage of the incident particle through the nucleus 
does not disturb the state of the target nucleus appreciably. On the
contrary, in order to explain the excitation of many particles, it is 
natural to go to the opposite extreme. The assumption is therefore 
made that all nucleons forming the nucleus or incident upon it interact 
strongly. 


As a consequence of the strong interaction, it was assumed, the 
incident particle and the nucleus it strikes coalesce, forming a compound
state in which all or very many nucleons participate collectively; the 
resonances are the energy values of the quantum states of the compound 
system. These states are not strictly stationary; they have a finite 
lifetime, since they can decay by the re-emission of the incident particle, 
by y-radiation, or otherwise. The width of the resonances indicates, 
however, that the lifetimes of the compound states created by particles 
of low energy are extremely long compared to the straight passage 
time of the incident particle across the nucleus. The states are almost 
stationary and should not differ greatly in their general properties from 
the real stationary states of the compound system at somewhat lower 
energy. 


This new view was extremely successful in describing the low-energy 
neutron experiments. Not only are closely-spaced and narrow 
resonances explained, but also the actual predominance of capture over 
scattering in low energy resonances is easily described. The very long 
lifetime of the compound state makes it possible for electromagnetic 
radiation to compete successfully with other modes of decay. The 
major part of the resonance cross section is thus shifted from scattering 
in the potential well model to capture on the BOHR picture.
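
The contrast between the lifetime of a narrow compound state and the straight passage time can be made concrete with a rough estimate. The sketch below is an illustration only: the width of 0.1 eV, the mass number 115, the internal kinetic energy of 30 Mev, and the radius constant of 1.4 × 10⁻¹³ cm are all assumed values, not figures taken from the text.

```python
import math

HBAR_EV_S = 6.582e-16         # hbar in eV s (rounded)
C_CM_PER_S = 2.998e10         # speed of light in cm/s (rounded)
NUCLEON_MASS_EV = 939.0e6     # nucleon rest energy in eV (rounded)


def lifetime_s(width_ev):
    """Lifetime hbar / Gamma of a state of width Gamma."""
    return HBAR_EV_S / width_ev


def passage_time_s(mass_number, kinetic_energy_ev, r0_cm=1.4e-13):
    """Time for a nucleon to cross the diameter 2R, with R = r0 * A**(1/3)."""
    radius_cm = r0_cm * mass_number ** (1.0 / 3.0)
    speed = C_CM_PER_S * math.sqrt(2.0 * kinetic_energy_ev / NUCLEON_MASS_EV)
    return 2.0 * radius_cm / speed


# Assumed inputs: 0.1 eV width, A = 115, about 30 MeV kinetic energy inside the well.
tau = lifetime_s(0.1)
t_pass = passage_time_s(115, 30.0e6)
print(f"lifetime     = {tau:.2e} s")
print(f"passage time = {t_pass:.2e} s")
print(f"ratio        = {tau / t_pass:.1e}")
```

With these assumed numbers the compound state outlives a single traversal by many orders of magnitude, which is the room that radiative capture needs in order to compete with re-emission.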


At higher energies the resonances increase in width and start to 
overlap. The observations are easily explained by two facts: First, 
the emission probability of a particle increases with energy, and, 
second, more reaction channels are open at higher energy, each one 
contributing to the total width. Present experimental material and 
reasonable extrapolations indicate that sharp and well-defined reson- 
ances are found only up to an energy of the incident particle of one to 
two Mev except for very light nuclei. At higher energies the widths 
become comparable to or larger than the level distance, and the 
resonance structure is lost. 


2. At the same time that BOHR gave his general description of nuclear 
reactions, more quantitative treatments of resonance phenomena were 
initiated by BREIT and WIGNER [5]. Many generalizations of this work
have been carried out, the most important ones by BETHE and PLACZEK [6]
and later on by WIGNER and his collaborators [7]. Accurate
measurements during the last two decades have shown that the resonance
phenomena in nuclear reactions are well represented in terms of the 
Breit-Wigner formula. The accuracy of this representation is a proof 
of the existence of well-defined compound states; and the properties of 
these states—lifetime, decay probabilities into different channels, etc.— 
can be measured and systematized. 


In the region of overlapping compound levels, the Breit-Wigner 
formula must be extended to include many levels, and the description 
becomes impractical, since the resulting cross sections depend largely 
upon unknown phase relations between the resonances. In order to 
allow some practical conclusions to be drawn about the yield of 
nuclear reactions in that region, the picture must be simplified. 


This simplification is usually introduced by an assumption which 
BOHR suggested in his address. We divide the nuclear reaction into two 
steps, the formation of the compound nucleus and its subsequent decay. 
Then we make the assumption that the decay is independent of the mode 
of formation of the compound nucleus. According to this “independence
hypothesis” it should not matter what incident particle and what
target nucleus are used as long as the same compound system is formed.


In the resonance region, the Breit-Wigner formula factors automatically
into the two stages of BOHR’s description, the cross section for
the formation of the compound nucleus and the probability of its decay 
into a given final configuration. In the region of overlapping resonances 
the factorization is a special assumption. With this assumption the 
cross section of an (a,b) reaction can be written in the form 


$$\sigma(a,b) = \sigma_c(a)\,(\Gamma_b/\Gamma)_c, \qquad (3)$$

where $\sigma_c(a)$ is the cross section for the formation of a compound
nucleus by the particle $a$ and $(\Gamma_b/\Gamma)_c$ is the probability that the particle
$b$ is emitted by the compound state.
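
The factorized form can be sketched in a few lines: the cross section for an (a,b) reaction is the formation cross section for entrance channel a multiplied by the branching ratio for exit channel b. The partial widths and formation cross sections below are hypothetical numbers chosen only to show the bookkeeping; they are not data from the text.

```python
import math


def reaction_cross_section(sigma_c, partial_widths, exit_channel):
    """Formation cross section times branching ratio Gamma_b / Gamma."""
    total_width = sum(partial_widths.values())
    return sigma_c * partial_widths[exit_channel] / total_width


# Hypothetical partial widths of one compound system (arbitrary units).
partial_widths = {"n": 0.6, "p": 0.1, "gamma": 0.3}
# Hypothetical formation cross sections (barns) for two entrance channels.
formation = {"p": 0.40, "alpha": 0.25}

for entrance, sigma_c in formation.items():
    s_n = reaction_cross_section(sigma_c, partial_widths, "n")
    s_p = reaction_cross_section(sigma_c, partial_widths, "p")
    print(f"entrance {entrance}: sigma(n) = {s_n:.3f} b, "
          f"sigma(p) = {s_p:.3f} b, ratio = {s_n / s_p:.2f}")
```

Because the branching ratios belong to the compound system alone, the ratio of neutron to proton yield comes out the same for both entrance channels. That equality is exactly what the independence hypothesis asserts and what experiments can test.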


This formula can be used for calculating reaction cross sections. The 
cross section $\sigma_c(a)$ can be roughly determined by assuming that the
compound nucleus is formed immediately when the nuclear surface is 
reached. All one has to do is to compute the probability for the incident 
particle to reach the surface of the nucleus, a problem which can be 
solved by simple quantum-mechanical calculation [8]. Next the decay 
probability of the compound nucleus can be determined by reversing 
the process. Because the decay through a given channel is the inverse 
of the formation of a compound nucleus through the same channel, any 
method that allows the calculation of $\sigma_c(a)$ can also be used to compute
the factor $(\Gamma_b/\Gamma)_c$ in (3).

The method of calculating reaction cross sections just sketched is 
often referred to as the “statistical method.” Some of its conclusions
can be suitably expressed by means of thermo-dynamical concepts 
which are based upon BOHR’s picture of the sharing of energy between
the constituents of the compound system. The excited compound 
nucleus is considered as a heated system and its subsequent decay as 
evaporation of particles [9]. 

The statistical method of determining the yield of nuclear reactions 
gives a reasonable account of their most important features. For 
example, it follows that reactions initiated by protons are weaker than 
those initiated by neutrons by a factor corresponding to the barrier 
penetration. A similar factor is expected to appear in the ratio between 
the yields of two reactions initiated by the same particle, one leading to 
proton emission and the other to neutron emission. Furthermore, the 
average energy of the emitted particles is expected to be small compared 
to the total energy available, the balance being left as excitation of the 
residual nucleus. 


These predictions are qualitatively correct. Reaction cross sections 
as functions of the energy of the incident particle, in particular, the 
yield-energy curves of proton or α-initiated reactions, are reasonably
well explained [10]. The relative yields of (x,n) and (x,2n) reactions (x
being neutron, proton, or α-particle) are reproduced in their essential
features. Also the energy distribution of reaction products is predicted 
fairly well. The spectrum of neutrons and protons emitted from 
nuclei bombarded with neutrons of 14 Mev or with protons of similar 
energy fits approximately the predicted Maxwell distribution of an 
evaporating compound nucleus, and the “temperatures” are not too far
from the expected magnitude [11]. 
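
The evaporation picture can be illustrated with the commonly used Maxwell-type spectrum N(E) ∝ E exp(−E/T). The text does not write this form out, so it is an assumption here, as is the nuclear temperature of 1.5 Mev. The sketch checks numerically that the most probable energy is T and the mean energy is 2T.

```python
import math


def evaporation_spectrum(energy_mev, temperature_mev):
    """Normalized Maxwell-type spectrum N(E) = (E / T**2) * exp(-E / T)."""
    return energy_mev / temperature_mev ** 2 * math.exp(-energy_mev / temperature_mev)


def spectrum_summary(temperature_mev, steps=60000, span=60.0):
    """Return (norm, mean energy, most probable energy) from a midpoint-rule sum."""
    de = span * temperature_mev / steps
    norm = 0.0
    mean = 0.0
    peak_energy, peak_value = 0.0, 0.0
    for i in range(steps):
        energy = (i + 0.5) * de
        value = evaporation_spectrum(energy, temperature_mev)
        norm += value * de
        mean += energy * value * de
        if value > peak_value:
            peak_energy, peak_value = energy, value
    return norm, mean / norm, peak_energy


T = 1.5  # assumed nuclear temperature in MeV, for illustration only
norm, mean_energy, peak_energy = spectrum_summary(T)
print(f"norm = {norm:.4f}, mean = {mean_energy:.3f} MeV, peak = {peak_energy:.3f} MeV")
```

For a temperature of a Mev or so, both characteristic energies are small compared with the 14 Mev or more brought in by the incident particle, in line with the statement that most of the energy remains as excitation of the residual nucleus.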


The success of the statistical method is limited to the qualitative 
description of these salient features. The accumulation of more 
quantitative data in recent years has produced a growing number of 
quantitative exceptions, and certain phenomena exhibit gross features 
unparalleled in the statistical model. Disagreements have been found, 
for example, in the study of the energy dependence of total neutron 
cross sections and in the yield of those reactions in which charged 
particles are emitted. When they are observed with poor energy 
resolution, the neutron cross sections show an energy dependence which 
is vaguely similar to the energy dependence of the scattering at a poten- 
tial well [12], a behaviour which is unexpected on the basis of the 
statistical model; charged particles frequently are emitted with energies 
much larger than the ones expected from evaporation and with an 
angular distribution peaked in the forward direction [13]. 


These disagreements and others, some of which will be discussed 
later on, are serious enough to warrant a thorough analysis of the two
assumptions on which the statistical treatment of nuclear reactions was 
based: The independence of the decay of the compound nucleus from 
the way in which it is formed; and the immediate formation of the 
compound nucleus when the incident particle reaches the nuclear 
surface. 


3. In examining the assumptions underlying our picture of nuclear 
reactions, we begin with the independence hypothesis. After surveying 
the experimental evidence that can be brought to bear on this assump- 
tion, we shall have a look at its logical foundations. 


There is not very much experimental material available which can be 
used to test this assumption directly. It is not easy to reach the same 
energy region in the compound system with different reactions. Never- 
theless a few statements can be made. 


As long as the energy region reached in the compound nucleus is in 
the resonance region, the independence hypothesis has always proved 
correct [14]. At higher energies the results are more questionable. 
GHOSHAL [15] has shown that the independence assumption seems to 
hold for a compound nucleus of an excitation energy of 15 to 40 Mev 
when produced by protons on Cu⁶³ or by α-particles on Ni⁶⁰. In many
other cases, however, there are indications that the assumption is not 
very good. B. COHEN and his collaborators [16] have pointed out that
the decay probabilities of the compound nucleus sometimes depend 
quite critically upon the way the compound state is formed. For 
example, COHEN and NEWMAN [16] compare the relative probabilities 
of emission of protons and of neutrons from compound states formed 
either with protons or with neutrons. Nuclei with mass numbers in the 
region 48 to 71 were bombarded with protons of 21 Mev and neutrons 
of 14 Mev. When the reaction is initiated by protons, it turns out that 
proton emission is more probable than neutron emission. 


Furthermore, the independence assumption is indirectly attacked by 
other evidence. In BOHR’s picture, independence is related to the idea
that the energy is shared by all particles and that the direction of inci- 
dence or position in the nucleus of the particle originally initiating the 
reaction becomes irrelevant in the compound state. In fact, as was 
mentioned, one often finds reaction products emitted with a much 
higher energy than one would expect if the energy of the incoming 
particle were shared among all constituents; and those reaction 
products often show an angular distribution (mostly forward) with 
respect to the direction of the incident particle. In these cases the com- 
pound state has more memory of the initial process than seems generally 
compatible with the independence hypothesis. 


What logical justification is there for the independence assumption, 
and what are the reasons for its possible breakdown? If the energy of 
the incident particle is within the region of sharp and well-defined 
resonances and if the energy coincides with or is near to a resonance, 
the assumption is obviously justified. The nuclear reaction then pro- 
duces only one quantum state of the compound system. The properties 
of a given quantum state evidently are independent of the way it is 
produced*. The validity of this conclusion is limited only by the fact 
that the resonances are not stationary states in the strict sense because 
of their finite width. Actually, because of the overlapping of the wings 
of neighbouring resonances, it is never exactly true that only one 
quantum state is realized. However, the deviations are small and of the 
order of level width to level distance. 


In the region of considerable overlap of resonances the validity of the 
independence assumption is by no means obvious†. In this case an
incident particle of fixed energy excites several compound states, and the 
relative phase of the states will depend upon the nature of the excitation. 
Hence, if the same compound nucleus is excited to the same energy by 
different processes, one would expect different phase relations between 
the compound states and different modes of decay, since the emission 
probabilities of a linear combination of states depend upon their phase
relations. Thus in the region of overlapping compound states the
independence assumption is no longer trivial; on the contrary, it is 
in need of justification (of a different justification from that of an 
isolated state). Otherwise, it is invalid. 


When the density of overlapping states is very large and many com- 
pound states are excited simultaneously, a new situation may arise. So 
many states and, therefore, so many phases are involved in one reaction 
that, even though the relative phases are determined by the excitation 
process, they may act in respect to the decay process as if they were 
random. It is then possible that the second stage of the nuclear reaction 
appears independent of the first in the region of strong overlap. 


That this possible independence should indeed be realized was made 
plausible very early in the form of a quasi-classical argument to which 
we have already alluded. It runs as follows: The energies for which 
overlap of resonances occurs are high enough that classical considera- 
tions may be significant. Using classical considerations alone, we argue 
first that the incident particle and the constituents of the target nucleus
interact so strongly with each other that, in a time short compared to
the free passage time of the particle through the nucleus, the energy is
distributed among a large number of constituents. Second, since the
particles are constantly and rapidly exchanging energy, a state of
statistical equilibrium is reached before the compound state is broken up;
and finally, because decay takes place from a statistical distribution,
the probabilities of the different break-ups do not depend upon the
nature of the initial delivery of the energy.

* Such a state may show directional “memory,” but not forward asymmetry. It has a
symmetry plane at right angles to the incident direction. (Polarization of the original
beams or targets is assumed absent.)

† Substantially, the following arguments were presented in 1939 by BOHR, PEIERLS and
PLACZEK [17].

The quasi-classical picture also contains an explanation of how 
exceptions to the independence assumption arise. The picture is only 
valid if the energy of the incident particle is high enough so that 
classical reasoning may be employed. On the other hand, because high 
incident energy gives rise to relatively short-lived compound states, it is 
not certain that there exists any energy region for which classical 
reasoning is applicable and in which the lifetime is long enough so that 
statistical equilibrium is reached before break-up. 

Not only does the time available for reaching equilibrium fall with 
increasing energy, but the interaction cross section between the incident 
particle and the nucleons becomes smaller, hampering the establish- 
ment of thermal equilibrium. Within nuclear matter the mean free path 
of nucleons of 100 Mev or over becomes comparable with the nuclear 
radius, and some nuclear reactions at that energy have been successfully 
described by assuming that an energetic nucleon passing through the 
nucleus interacts with individual nucleons rather than with a many-body 
system [18]. In such a reaction the details of “formation” are all
important. 

At intermediate energies where the interaction is still strong, the 
extreme description of nuclear reactions as individual collisions, 
nucleon by nucleon, will not hold. Nevertheless, it is not necessary to 
return to the opposite case of states weighting all nucleons equally. 
Something in between is reasonable; and one can easily imagine 
mechanisms that might lead to nuclear reactions in which the break-up 
of the compound system is dependent on the mode of formation. 

For this intermediate energy region, BETHE [19] suggested the
possibility of a “spot heating” effect. When the incident particle
impinges upon the nuclear surface, its energy is first shared by the
nucleons surrounding the point of contact*. This small region of the
nucleus acquires a rather high temperature; and it may emit a nucleon
immediately with an energy much higher than that expected if the total
energy were distributed over the whole nucleus. Somewhat different
from but related to spot heating is the transfer of energy directly to a
single nucleon when an incident particle grazes the surface. In both
cases the energy of the ejected particle will be large, but spot heating
corresponds crudely to the backward particle emission and grazing
collisions to the forward part of the angular distribution. Another
possible mechanism of this type would be the setting up by the initial
impact of a surface deformation which then travels around the nucleus
and is focused on the opposite side where it might then give rise to the
detachment of a particle. The reactions produced by these mechanisms
are all examples in which the independence assumption breaks down.

* All localizations in these quasi-classical discussions are defined only with an accuracy
of one wave length, which is small compared to nuclear dimensions for energies of 10 Mev
and over.


In his original paper BOHR carefully pointed out that the compound
nucleus and the two-step analysis of reactions would not provide an 
adequate picture for reactions involving very light nuclei. Deuteron 
stripping and pickup reactions appropriately fall in this class even 
though the idea of a compound nucleus involving one of the nucleons 
transferred in the reaction is a useful concept in the detailed theory of 
these reactions. As the complete compound system often does not form 
in these reactions, they play a multiple role in this classification falling 
alongside spot heating and grazing in their failure to share the incident 
energy statistically. 


It is interesting to re-analyze the independence hypothesis in the 
following terms: The quasi-classical and the quantum description of 
the compound system are related by the correspondence principle. In 
particular, the average energy spacing $D$ of the quantum states in
systems of simple periodic character corresponds to the period
$2\pi\hbar/D$ of the motion. In complicated systems, one cannot easily
define the period of the motion; however, the time $\tau = 2\pi\hbar/D$ still
plays the role of the time interval in which the classical motion has gone
through all those configurations which are compatible with the initial
conditions: after a time of the order $\tau$ the configurations will all be
similar (within the uncertainty relations) to configurations which were
realized before [20].


In the region of strongly overlapping compound states, the width $\Gamma$
of the states is much larger than the distance $D$, which means that the
lifetime $\hbar/\Gamma$ is much shorter than the characteristic repetition
time $\tau$. In other words, the system has not enough time to go through
all configurations compatible with its initial energy, angular momentum, 
etc. It would go through all these configurations, if, by some device, all 
channels of decay were closed. Further, if the channels leaked so slowly
that all configurations were passed through many times, the decay of 
the compound system would be independent of the exact starting point. 
Both decay time and decay probabilities would be independent of 
formation. With these channels open, however, the lifetime becomes so 
short that only a few of the configurations can be realized, and when the 
system passes through few configurations, just which ones are realized 
may depend critically upon the initial conditions. Therefore the decay 
of the compound nucleus may depend upon the way it was formed. 
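
The comparison of the two time scales reduces to a single number: the lifetime ħ/Γ divided by the repetition time 2πħ/D equals D/(2πΓ). The sketch below evaluates it for two hypothetical pairs of level spacing and width, chosen only to represent the sharp-resonance region and the region of strong overlap.

```python
import math

HBAR_EV_S = 6.582e-16  # hbar in eV s (rounded)


def recurrence_time_s(spacing_ev):
    """Characteristic repetition time 2 * pi * hbar / D."""
    return 2.0 * math.pi * HBAR_EV_S / spacing_ev


def lifetime_s(width_ev):
    """Lifetime hbar / Gamma of the compound state."""
    return HBAR_EV_S / width_ev


def cycles_before_decay(spacing_ev, width_ev):
    """Lifetime divided by repetition time, equal to D / (2 * pi * Gamma)."""
    return lifetime_s(width_ev) / recurrence_time_s(spacing_ev)


# Hypothetical (D, Gamma) pairs in eV, for illustration only.
cases = {"sharp resonances": (10.0, 0.1), "strong overlap": (1.0, 1000.0)}
for label, (spacing_ev, width_ev) in cases.items():
    print(f"{label}: {cycles_before_decay(spacing_ev, width_ev):.3g} repetition periods")
```

In the first case the system lives through many repetition periods before decaying; in the second it decays after a small fraction of one, so only a few configurations are visited and the memory of the formation process can survive.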


Again, we find that, if the independence hypothesis is valid in the 
region of overlapping, it must be true despite the fact that different 
configurations are realized for different modes of formation. Since the 
probabilities of decay are general features, it is possible that they may 
not depend on the detailed configurations; they may come from 
averages which wipe out some of the details. Nevertheless, above the 
energy region of sharp separated resonances, the independence 
hypothesis is insecure. 


In general, then, the independence hypothesis is a very far-reaching 
assumption. Its significance is different in the region of sharp resonances 
and in the region of overlapping widths. In the resonance region the 
hypothesis is obviously valid, since it follows from the fact that well- 
defined quantum states are created as indicated by the sharp resonances. 
At higher energies, however, its validity is doubtful. There is a middle 
region where the widths of the resonances are of the order of their 
spacing in which the independence hypothesis will not hold at all. 
There only a few component states are simultaneously excited and the 
phase relations necessarily will be typical of the mode of excitation. 
Finally, in the region of strong overlapping the independence assump- 
tion might be valid again, at least in some cases. In these cases its 
validity represents a limiting case of statistical disorder reached in the 
compound state before its break-up; and, since decay times decrease 
rapidly with increasing energy, it is not certain whether these chaotic 
conditions ever are reached. 


It is, therefore, understandable that nuclear reactions have been 
found for which the two-step description with independence of decay 
from formation of the compound system is not applicable. Such 
nuclear reactions in which the properties and probabilities of the 
products are closely related to the details of the initial configuration 
are to be expected. Fruitful idea though it was, independence is too 
drastic a simplification. 


4. We now turn to the second hypothesis used in the statistical method 
of calculating nuclear reactions, the assumption that a compound
nucleus is created when the incident particle reaches the nuclear surface. 
There cannot be any doubt that compound states are sometimes created 
by the incident particles; the observation of closely spaced resonances 
is the best proof. The remaining question is whether such many-particle 
states are always formed “immediately” after the particle enters the 
nucleus. 


The assumption of immediate formation was based upon the classical 
picture of a particle impinging upon a system of strongly bound 
constituents. The strong interaction energy between nucleons seems to 
lead necessarily to a quick exchange of energy. The recent success of 
the shell model in describing the properties of the lowest states of the 
nuclei casts some doubt upon this conclusion. The evidence revealed 
in the light of the shell model shows that a nucleon can move com- 
paratively freely within the nuclear volume. Such a nucleon seems to 
possess an angular momentum of its own and to move in a well-defined 
single particle orbit. 


As yet no satisfactory explanation of this behaviour exists. The 
apparent contradiction between single particle orbits and the strong 
interaction observed in the scattering of nucleons by nucleons might be 
explained in two ways: either the forces between nucleons are much 
weaker when they are within nuclear matter, or the configurations in 
low nuclear states are such that they are describable by independent 
particle orbits in spite of the strong interactions. 


Not very much evidence has been found so far in support of the first 
explanation. Indirectly, perhaps, our present difficulties in understand- 
ing the saturation of nuclear forces on the basis of the free nucleon- 
nucleon interaction speak in favour of some change in the internucleon 
potential when the nucleons are closely packed. However, it seems that 
small changes would be sufficient to explain saturation, for example, 
the introduction of repulsive three-body forces, or a large repulsive core 
in the potential between two particles [21]. These changes are not in 
themselves sufficient to guarantee the free particle behaviour of nucleons 
in nuclear matter, since they do not exclude strong interactions between 
two neighbouring nucleons. More drastic changes have been suggested, 
which would make the free particle motion obvious. An example is a 
non-linear behaviour of nuclear potentials. A saturation value of the 
potential is assumed to be created by a high density of particles [22]. 
If this saturation value is reached within nuclear matter of normal 
density, the nucleons fail to interact, even if they approach each other 
closely. 


The second explanation has not yet been formulated in a satisfactory 
form. The free particle behaviour in low excited states might perhaps
be understood even in the presence of strong interaction by invoking 
the exclusion principle, which forbids momentum or energy transfer 
from one particle to the other because the states into which the particles 
could be scattered are all occupied [23]. However, until it is proved that 
the free particle states are indeed the lowest, this reasoning is not much 
more than a plausibility argument. 


In spite of the absence of a satisfactory explanation, there is no 
doubt that the low-lying nuclear energy states can be described sur- 
prisingly well by a system in which nucleons move with little interaction 
in a common potential well; and the many facts thus summarized 
require a reappraisal of our idea of the formation of a compound 
nucleus by an incident particle impinging on a target nucleus.


In attempting this reappraisal, we may be guided by some of the 
discrepancies between the experiments and the statistical model. For 
example, as we mentioned, the large-scale energy dependence of the 
total neutron cross sections disagrees with the statistical theory. The 
statistical model for neutron cross sections assumes that the compound 
nucleus is formed immediately when the incident neutron reaches the 
nuclear surface; and the cross section for reaching the surface turns out 
to be a monotonically decreasing function of the energy, going as $E^{-1/2}$
for small energies and reaching the asymptotic value $2\pi R^2$ for large
energies [8]. The experiments do not agree; the observed cross sections 
show a more complicated behaviour which seems to point to some 
combination of single particle and compound nucleus. Looked at under 
the microscope of high resolution experiments, the cross sections show 
the narrow peaks which led to the BOHR theory. Looking more
broadly, BARSCHALL and his co-workers [12] have plotted the observed 
total cross section in a three-dimensional plot against energy E and 
mass number A. This plot exhibits systematic regularities with
maxima and minima at values of E and A, where the old pre-Bohr 
theories of the simple potential well would have placed them. 


Although these large-scale maxima are not as strongly pronounced 
as the simple well theory says, they are entirely unexpected on the
statistical model. They strongly suggest a partial return to the old 
potential well. Obviously, since slow neutron experiments show that 
compound states are formed in which the energy is distributed over the 
nuclear constituents, the old one-body description cannot be exactly 
valid; but an intermediate situation between the pure one-body 
treatment and the immediate formation of the compound nucleus may 
be envisaged. In this intermediate theory, for some circumstances at 
least, the motion of the incident particle within the nucleus should be 
approximately the same as a single particle moving in a potential well. 


5. One attempt to combine single particle and compound nucleus 
pictures is embodied in the optical model of the nucleus [24]. This 
model describes the effect of the nucleus on the incident particle by a 
potential well, $-V_0(r)$, but allows for the possibility of formation of
the compound nucleus by adding to the potential a negative imaginary
part, $-iV_1(r)$. This part produces an absorption of the incoming
wave within the nucleus, and this absorption is supposed to represent
the formation of a compound nucleus. As $V_1$ measures what is taken
out of the single particle description rather than what goes into any 
other particular mode, our use of the term “formation of the com- 
pound nucleus” is broad here. It includes not only processes in which 
the particle shares its energy with all nucleons and forms a compound 
state in the orthodox sense, but also processes of the kind described in 
§ 3 in which the particle interacts with parts of the target only. The 
essential point is that it represents any process which removes the 
incoming particle from the entrance channel. 


On the optical model, compound formation does not take place
“immediately” nor with complete certainty. Even if the incident
particle has entered the nucleus, it is removed from its free particle
state only with some delay and a certain probability. If $V_0(r)$ and $V_1(r)$
are reasonably constant over the nucleus, one can define a coalescence
coefficient; it is the probability per unit length of path for the incident
particle in nuclear matter to form the compound system. For an
incident particle of energy E, the coalescence coefficient is given by


$$K = \left[\frac{m}{2(E + V_0)}\right]^{1/2} \frac{2V_1}{\hbar}. \tag{4}$$


$K^{-1}$ represents the distance which the particle must traverse within
the nucleus before the compound formation occurs with appreciable
probability, and $\tfrac{1}{2}(\hbar/V_1)$ is the mean time before coalescence takes place.
On this picture, if an incident particle gets into the nucleus, it is reflected 
back and forth approximately as in the old potential well model before 
escaping or coalescing. Where the old well has virtual states, they have 
been re-introduced as a precursor to the final compound nucleus; and 
the nuclear reaction may be thought of as proceeding in two stages, a 
brief formation of a single particle state followed by escape or 
coalescence. 
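The size of the coalescence coefficient can be illustrated numerically. The following is a minimal sketch, assuming Eq. (4) in the form $K = [m/2(E + V_0)]^{1/2}\,(2V_1/\hbar)$, non-relativistic kinematics, and rounded values of $\hbar c$ and the neutron rest energy; the function names are hypothetical and chosen only for this illustration.

```python
import math

HBAR_C = 197.327        # hbar * c in MeV fm (rounded)
NEUTRON_MC2 = 939.565   # neutron rest energy in MeV (rounded)
C_FM_PER_S = 2.998e23   # speed of light in fm/s


def coalescence_coefficient(energy, v0, v1):
    """K = [m / (2 (E + V0))]^(1/2) * 2 V1 / hbar, returned in fm^-1.

    All energies are in MeV; sqrt(m c^2 / (2 (E + V0))) equals c / v inside the well.
    """
    inverse_speed = math.sqrt(NEUTRON_MC2 / (2.0 * (energy + v0)))
    return inverse_speed * 2.0 * v1 / HBAR_C


def coalescence_time(v1):
    """Mean time before coalescence, hbar / (2 V1), in seconds."""
    return HBAR_C / (2.0 * v1) / C_FM_PER_S


if __name__ == "__main__":
    for v1 in (1.0, 2.0):
        k = coalescence_coefficient(0.0, 40.0, v1)
        print(f"V1 = {v1} MeV: K = {k:.4f} fm^-1, 1/K = {1.0 / k:.1f} fm, "
              f"time = {coalescence_time(v1):.2e} s")
```

With a well depth of 40 Mev and an imaginary part between 1 and 2 Mev, the sketch gives a distance $K^{-1}$ of a few times the nuclear radius, which is the sense in which coalescence is not immediate.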


In discussion some years ago when the evidence for the shell structure 
was accumulating and some of the inadequacies of the compound 
nucleus picture were becoming more apparent, BOHR suggested that 
these new insights would give us a more detailed picture of nuclear 
reactions, and, in particular, that one should investigate the possibility 


The Compound Nucleus 147 


of introducing earlier stages into the picture of the reaction before the 
formation of the eventual compound state. The intermediate stage 
introduced by the optical model into the picture of a nuclear reaction, 
the stage during which the single particle is inside the nucleus, is 
probably the simplest realization of the stages suggested by BOHR.
During this stage the optical model automatically combines the reflec- 
tions of the waves at the edge of the potential well with the coalescence 
inside the nucleus to tell us how often the compound system is formed. 


With suitable simple assumptions regarding the form of $V_0$ and $V_1$
the effect of the potential


$$V(r) = -V_0(r) - iV_1(r)$$


on an incident beam of particles can be determined. The scattering and
absorption cross sections, $\sigma_{el}^{op}$ and $\sigma_c^{op}$, are thus obtained as functions of
the energy. The absorption cross section $\sigma_c^{op}$ of the model should
represent the cross section for the formation of the compound nucleus
in real life. The scattering cross section $\sigma_{el}^{op}$ derived from the model
must be related to the elastic scattering. By comparing the cross
sections calculated from the model with experiments one hopes to be
able to determine V(r).

When the observed total neutron cross section is averaged over an 
energy interval large enough to smooth out any narrow resonances, it 
is surprisingly well represented by the complex potential model. With 
the simplest assumption of a square well for both $V_0$ and $V_1$, one can
reproduce the characteristic maxima and minima of this average cross
section as a function of energy and mass number, covering all A and an
energy range from zero to a few Mev. The following potential gives the
best fit: [25]


$$V_0 = 40\ \text{Mev}, \quad 1\ \text{Mev} < V_1 < 2\ \text{Mev} \quad \text{for } r < R, \qquad \text{and} \qquad V_0 = V_1 = 0 \quad \text{for } r > R, \tag{5}$$


with

$$R = 1.45 \times 10^{-13} A^{1/3}\ \text{cm}.$$

We conclude therefore that the mean free path for compound nucleus
formation of a slow neutron entering the nucleus is about


$$K^{-1} \approx 2 \times 10^{-12}\ \text{cm}.$$


The neutron does not form a compound nucleus immediately upon
entering the target nucleus; in fact, the probability of formation is
about 0.30 for one trip across a medium-sized nucleus.
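The order of magnitude of this probability can be checked with a short calculation. The sketch below assumes an exponential attenuation law, $P = 1 - e^{-d/\Lambda}$, with mean free path $\Lambda = 20$ fm ($2 \times 10^{-12}$ cm) and radius $R = 1.45\,A^{1/3}$ fm. The text does not say which path length d is meant by "one trip", so the radius, the mean chord 4R/3 and the diameter are all tried; these choices, the mass number A = 100 and the function names are assumptions made for illustration.

```python
import math


def nuclear_radius_fm(mass_number, r0_fm=1.45):
    """Square-well radius R = r0 * A^(1/3) in fm."""
    return r0_fm * mass_number ** (1.0 / 3.0)


def formation_probability(path_fm, mean_free_path_fm):
    """Probability of coalescence along a path, assuming exponential attenuation."""
    return 1.0 - math.exp(-path_fm / mean_free_path_fm)


if __name__ == "__main__":
    radius = nuclear_radius_fm(100)   # assumed "medium-sized" nucleus
    mean_free_path = 20.0             # fm, i.e. 2e-12 cm
    for label, path in (("radius", radius),
                        ("mean chord 4R/3", 4.0 * radius / 3.0),
                        ("diameter", 2.0 * radius)):
        p = formation_probability(path, mean_free_path)
        print(f"{label}: path = {path:.1f} fm, P = {p:.2f}")
```

Under these assumptions the probability falls between roughly 0.3 and 0.5 depending on the path chosen, in line with the estimate quoted above.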




6. That such a simple model as the complex potential fits the experi- 
mental data so well is perhaps a bit surprising. In order to see whether 
or not the correspondence can be understood, it is necessary to examine 
the relation of reality to the optical model more closely than was done 
above. In particular, just what the formation of the compound nucleus 
means needs more careful consideration. 


In the optical model, compound nucleus formation and absorption 
are the same and represent the removal of the particle from the entrance 
channel. Reality is certainly more complicated. Among other compli- 
cations, the compound nucleus can emit the incident particle back into 
the entrance channel. 


This re-emission does not lead to any trouble in the high-energy 
region. If the incident particle arrives in the continuum rather than the 
resonance region, so many channels are open for decay of the compound 
system that the probability of re-emission is negligible. Compound 
nucleus formation and absorption can therefore be equated. In this 
respect “reality” and the optical model are the same.


On the other hand, if the energy of the incident particle is low, the 
reality of closely-spaced resonances does not correspond with the 
smooth energy dependence of the optical model, and the existence of 
resonances must be expressed by boundary conditions on the incident 
wave which are violently different from those of the optical model. 
(For example, the conditions for resonance require zero slope of the 
external wave function at the nuclear radius [26].) Also fewer decay 
channels are open; and we can no longer be sure that re-emission into 
the entrance channel is negligible. Consequently, compound nucleus 
formation can no longer be equated with absorption. 


In the resonance region, then, there are conceptual difficulties in 
relating the optical model to the actual state of affairs. In order to 
resolve these difficulties, it is necessary to carry out some kind of 
averaging over the resonances. Such averaging is required to wash out 
the sharp bumps. It also can be used to overcome the difficulty in 
relating compound nucleus formation to absorption when there is 
re-emission into the entrance channel. Finally it will supply us with a 
resolution of the paradox represented by the difference between
resonance boundary conditions changing rapidly with energy and the slow
energy variations of the optical model. 


How the averaging over resonances arrives at a consistent relation 
between “reality” and the optical model can be seen in two ways, one
somewhat mathematical, the other relatively anschaulich. Because the 
more mathematical method allows us to make a reasonably concise 


statement of the relation, we shall use it first and then try to see into
the results through the more intuitive discussion.

For simplicity, we restrict the discussion to neutron reactions with the
l = 0 partial wave. A wave function


$$\psi = \frac{1}{r}\left(e^{-ikr} - \eta\, e^{ikr}\right) \tag{6}$$

outside the range R of interaction with the nucleus is then sufficiently
general if $\eta$ is a function of the incident energy E, a function which
varies rapidly as E passes through a resonance. The cross sections are
related to $\eta$ as follows [26]:

$$\sigma_{el} = \frac{\pi}{k^2}\,|1 - \eta|^2, \qquad
\sigma_r = \frac{\pi}{k^2}\,(1 - |\eta|^2), \qquad
\text{and} \qquad \sigma_{tot} = \frac{2\pi}{k^2}\,(1 - \mathrm{Re}(\eta)), \tag{7}$$


where $\mathrm{Re}(\eta)$ means the real part of $\eta$. These are the genuine elastic
scattering, reaction, and total cross sections for that precise energy E
of the incident neutrons which corresponds to the wave number k: i.e.
$E = (\hbar k)^2/2m$.
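The relations between the cross sections and the coefficient $\eta$ of the outgoing wave can be made concrete with a few lines of code. This is a minimal sketch, assuming the l = 0 forms $\sigma_{el} = (\pi/k^2)|1 - \eta|^2$, $\sigma_r = (\pi/k^2)(1 - |\eta|^2)$ and $\sigma_{tot} = (2\pi/k^2)(1 - \mathrm{Re}\,\eta)$; the function name and the sample value of $\eta$ are hypothetical.

```python
import cmath
import math


def cross_sections(eta, k):
    """Return (elastic, reaction, total) l = 0 cross sections for outgoing-wave coefficient eta.

    k is the wave number; the results are in units of 1 / k^2 times length squared.
    """
    prefactor = math.pi / k ** 2
    sigma_el = prefactor * abs(1.0 - eta) ** 2
    sigma_r = prefactor * (1.0 - abs(eta) ** 2)
    sigma_tot = 2.0 * prefactor * (1.0 - eta.real)
    return sigma_el, sigma_r, sigma_tot


if __name__ == "__main__":
    eta = 0.6 * cmath.exp(0.8j)   # illustrative value with |eta| < 1
    sigma_el, sigma_r, sigma_tot = cross_sections(eta, k=1.0)
    # The total cross section is the sum of the elastic and reaction parts.
    print(sigma_el + sigma_r, sigma_tot)
```

The identity $|1 - \eta|^2 + 1 - |\eta|^2 = 2(1 - \mathrm{Re}\,\eta)$ guarantees that the elastic and reaction parts add up to the total, which is why the total cross section is linear in $\eta$ while the other two are not.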

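We now average these cross sections over an energy interval I which
contains many resonances. The average is defined for convenience so
that $k^2\sigma$ is averaged rather than $\sigma$, but when the mean energy is great
compared with the interval I, the $k^2$ can be factored out; and only at
very low energies is some special consideration needed. With this
caveat, we define


$$\langle f \rangle = \frac{1}{I}\int_{E - I/2}^{E + I/2} f(E')\,dE'$$

and

$$\bar\sigma = \frac{1}{k^2(E)}\,\langle k^2\sigma\rangle .$$

From these definitions we obtain

$$\bar\sigma_r = \frac{\pi}{k^2}\,(1 - \langle|\eta|^2\rangle) = \frac{\pi}{k^2}\,(1 - |\langle\eta\rangle|^2) - \sigma_{fl}, \tag{8}$$

where $\sigma_{fl} = (\pi/k^2)\langle|\Delta\eta|^2\rangle$
and $\langle|\Delta\eta|^2\rangle = \langle|\eta|^2\rangle - |\langle\eta\rangle|^2$ is the mean square fluctuation in the
interval I of the coefficient of the outgoing wave. Also


$$\bar\sigma_{tot} = \frac{2\pi}{k^2}\,(1 - \mathrm{Re}\langle\eta\rangle),$$


which depends only on $\langle\eta\rangle$ as it must because $\sigma_{tot}$ is linear in $\eta$.
($\bar\sigma_{el}$, however, depends on $\langle|\eta|^2\rangle$.) We call $\sigma_{fl}$ the fluctuation cross
section.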
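The way the fluctuation term appears on averaging can be verified directly. The sketch below works in units of $\pi/k^2$ and assumes only the identities $\langle|1 - \eta|^2\rangle = |1 - \langle\eta\rangle|^2 + \langle|\Delta\eta|^2\rangle$ and $\langle 1 - |\eta|^2\rangle = 1 - |\langle\eta\rangle|^2 - \langle|\Delta\eta|^2\rangle$; the random sample of $\eta$ values is purely illustrative and is not meant to model any real set of resonances.

```python
import cmath
import random


def mean(values):
    """Arithmetic mean of an iterable of real or complex numbers."""
    values = list(values)
    return sum(values) / len(values)


def averaged_cross_sections(etas):
    """Averaged elastic and reaction cross sections and their smooth parts, in units of pi / k^2."""
    mean_eta = mean(etas)
    fluctuation = mean(abs(e) ** 2 for e in etas) - abs(mean_eta) ** 2
    return {
        "elastic": mean(abs(1.0 - e) ** 2 for e in etas),
        "reaction": mean(1.0 - abs(e) ** 2 for e in etas),
        "shape_elastic": abs(1.0 - mean_eta) ** 2,
        "smooth_absorption": 1.0 - abs(mean_eta) ** 2,
        "fluctuation": fluctuation,
    }


if __name__ == "__main__":
    random.seed(0)
    sample = [random.uniform(0.2, 1.0) * cmath.exp(1j * random.uniform(0.0, 2.0 * cmath.pi))
              for _ in range(1000)]
    result = averaged_cross_sections(sample)
    # Fluctuations add to the averaged elastic scattering and subtract from the reaction part.
    print(result["elastic"] - result["shape_elastic"], result["fluctuation"])
    print(result["smooth_absorption"] - result["reaction"], result["fluctuation"])
```

The same quantity is added to the scattering and removed from the reaction cross section, so the averaged total is unaffected by the fluctuations.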

In making the correspondence between reality and the optical model, 
we must identify the total cross section found from the model, $\sigma_{tot}^{op}$, with
the average of the “real” total cross section at precise energies. Part of
our correspondence is therefore


$$\sigma_{tot}^{op} = \bar\sigma_{tot} = \frac{2\pi}{k^2}\,(1 - \mathrm{Re}\langle\eta\rangle). \tag{9}$$


To complete the correspondence, we wish to identify the absorption 
cross section of the model with formation of the compound system. 
The absorption on the model corresponds therefore to the sum of the 
reactions and that part of the elastic scattering which occurs by decay 
of the compound system through the entrance channel. The analogous 
cross sections in a resonance theory are 


$$\sigma_c = \sigma_r + \sigma_{ce},$$

where $\sigma_c$ is the cross section for compound formation, and (on the
right) it is expressed as the decay into reaction cross section $\sigma_r$ and
compound elastic scattering, $\sigma_{ce}$. Performing the averages, we take


$$\sigma_c^{op} = \bar\sigma_c = \bar\sigma_r + \bar\sigma_{ce}. \tag{10}$$


We now proceed to the determination of $\bar\sigma_c$. Using (8) and (10), we
find


$$\bar\sigma_c = \frac{\pi}{k^2}\,(1 - |\langle\eta\rangle|^2) - \sigma_{fl} + \bar\sigma_{ce}. \tag{11}$$


At this stage we have no specific representation for $\bar\sigma_{ce}$, but we have the
following information about it. At high energies $\bar\sigma_{ce}$ vanishes because
of the competition with many modes of decay of the compound system.
At the same energies we expect $\eta$ to be smooth, and therefore the
fluctuation $\sigma_{fl}$ also becomes negligible. In this energy region therefore


$$\bar\sigma_c \approx \frac{\pi}{k^2}\,(1 - |\langle\eta\rangle|^2). \tag{12}$$


At lower energies both $\sigma_{fl}$ and $\bar\sigma_{ce}$ increase. We shall see that (12)
remains valid and that the two cross sections $\sigma_{fl}$ and $\bar\sigma_{ce}$ are equal. To


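examine the validity of (12) in the low energy region, we go to the
extreme case of well-separated resonances ($\Gamma \ll D$). We introduce the
necessary extra information (a resonance theory) in the form of an
approximate $\eta$ good at an isolated resonance occurring in the compound
system for the incident neutron energy $E_s$:


$$\eta_{BW} = e^{2i\delta}\left(1 - \frac{i\Gamma_n^{(s)}}{E - E_s + i\Gamma^{(s)}/2}\right), \tag{13}$$

where $\delta$ is a slowly varying phase changing only over the single particle
energy spacing and therefore irrelevant as long as we stay within a
reasonable energy range. For this special case, as we see by putting
$\eta_{BW}$ into (7), the reaction cross section is

$$\sigma_r = \frac{\pi}{k^2}\,\frac{\Gamma_n^{(s)}\,(\Gamma^{(s)} - \Gamma_n^{(s)})}{(E - E_s)^2 + (\Gamma^{(s)}/2)^2}$$

in agreement with the Breit-Wigner formulae

$$\sigma_c = \frac{\pi}{k^2}\,\frac{\Gamma_n^{(s)}\,\Gamma^{(s)}}{(E - E_s)^2 + (\Gamma^{(s)}/2)^2} \qquad \text{and} \qquad \sigma_{ce} = \sigma_c\,\frac{\Gamma_n^{(s)}}{\Gamma^{(s)}} .$$


Upon averaging, we then obtain


$$\bar\sigma_c = \frac{2\pi^2}{k^2}\,\frac{\bar\Gamma_n}{D},$$

where $\bar\Gamma_n/D = \frac{1}{I}\sum_s \Gamma_n^{(s)}$ is the average neutron width divided by the
average spacing; and

$$\bar\sigma_{ce} = \frac{2\pi^2}{k^2}\,\frac{1}{I}\sum_s \frac{(\Gamma_n^{(s)})^2}{\Gamma^{(s)}}. \tag{14}$$


We compare the result for $\bar\sigma_c$ with the value of $\frac{\pi}{k^2}(1 - |\langle\eta\rangle|^2)$ by
putting in $\eta_{BW}$ for $\eta$. We get


$$\langle\eta_{BW}\rangle = e^{2i\delta}\left(1 - \pi\,\frac{\bar\Gamma_n}{D}\right)$$

and therefore


$$\frac{\pi}{k^2}\,(1 - |\langle\eta_{BW}\rangle|^2) = \frac{2\pi^2}{k^2}\,\frac{\bar\Gamma_n}{D}\left(1 - \frac{\pi}{2}\,\frac{\bar\Gamma_n}{D}\right).$$


Because $\Gamma_n < \Gamma \ll D$, the last term in the brackets is negligible.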
Consequently, Eq. (12) is valid even in this extreme case. (What is 


neglected in making the equality is of the same order as the accuracy
of the approximate resonance theory we have used.) This result is
equivalent (according to (11)) to


$$\sigma_{fl} = \bar\sigma_{ce},$$


an equality which also can be established directly by evaluating
$\langle|\Delta\eta|^2\rangle$ to determine $\sigma_{fl}$ and then comparing the result with $\bar\sigma_{ce}$ as
given by (14).
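The two averages used in this argument can be checked numerically for an idealized level scheme. The sketch below is an illustration under strong assumptions that are not in the text: identical resonances on an equally spaced "picket fence" $E_s = sD$, the phase $\delta$ set to zero, and the single-level terms of the Breit-Wigner form simply added, using the standard sum $\sum_s 1/(z - sD) = (\pi/D)\cot(\pi z/D)$. All names and numerical values are hypothetical.

```python
import cmath
import math


def eta_picket_fence(energy, spacing, gamma_n, gamma):
    """eta = 1 - i Gamma_n * sum_s 1 / (E - s D + i Gamma / 2), with delta = 0 (assumed)."""
    z = math.pi * complex(energy, 0.5 * gamma) / spacing
    level_sum = (math.pi / spacing) * cmath.cos(z) / cmath.sin(z)   # closed form of the sum over s
    return 1.0 - 1j * gamma_n * level_sum


def energy_averages(spacing, gamma_n, gamma, n_points=4000):
    """Return (<eta>, <|eta|^2> - |<eta>|^2) averaged over one level spacing (midpoint rule)."""
    energies = [(-0.5 + (j + 0.5) / n_points) * spacing for j in range(n_points)]
    etas = [eta_picket_fence(e, spacing, gamma_n, gamma) for e in energies]
    mean_eta = sum(etas) / n_points
    mean_abs2 = sum(abs(x) ** 2 for x in etas) / n_points
    return mean_eta, mean_abs2 - abs(mean_eta) ** 2


if __name__ == "__main__":
    spacing, gamma_n, gamma = 1.0, 0.005, 0.02   # illustrative, with Gamma_n < Gamma << D
    mean_eta, fluctuation = energy_averages(spacing, gamma_n, gamma)
    print("mean eta:", mean_eta, "expected:", 1.0 - math.pi * gamma_n / spacing)
    # Compound elastic estimate of Eq. (14) in units of pi / k^2: 2 pi Gamma_n^2 / (D Gamma).
    print("fluctuation:", fluctuation,
          "compound elastic:", 2.0 * math.pi * gamma_n ** 2 / (spacing * gamma))
```

In this idealized case the averaged $\eta$ reproduces $1 - \pi\bar\Gamma_n/D$, and the fluctuation agrees with the compound elastic estimate up to corrections of relative order $\Gamma/D$, which is the accuracy claimed for the argument.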

We are now in possession of the complete correspondence 


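$$\sigma_{tot}^{op} = \frac{2\pi}{k^2}\,(1 - \mathrm{Re}\langle\eta\rangle),$$

$$\sigma_c^{op} = \frac{\pi}{k^2}\,(1 - |\langle\eta\rangle|^2),$$


and consequently

$$\sigma_{el}^{op} = \frac{\pi}{k^2}\,|1 - \langle\eta\rangle|^2 .$$


The whole answer can be abbreviated:

$$\eta^{op} = \langle\eta\rangle .$$


A little more light may be shed on the identification of $\eta^{op}$ with
$\langle\eta\rangle$ by considering the scattering averages,


$$\bar\sigma_{el} = \frac{\pi}{k^2}\,\langle|1 - \eta|^2\rangle
= \frac{\pi}{k^2}\,|1 - \langle\eta\rangle|^2 + \frac{\pi}{k^2}\,\langle|\Delta\eta|^2\rangle .$$


Therefore

$$\sigma_{se} \equiv \bar\sigma_{el} - \sigma_{fl} = \frac{\pi}{k^2}\,|1 - \langle\eta\rangle|^2 .$$


This is the smooth, non-fluctuating part of the scattering, sometimes
called the shape elastic. Since $\bar\sigma_{ce} = \sigma_{fl}$ and $\sigma_{el}^{op} = \bar\sigma_{el} - \bar\sigma_{ce}$, we again
find that this shape-elastic scattering corresponds to the scattering of
the model ($\sigma_{el}^{op} = \frac{\pi}{k^2}|1 - \langle\eta\rangle|^2$). The optical model is therefore
connected to the averaged problem in which the fluctuations are removed
entirely by transferring them from scattering to absorption:


$$\sigma_{el}^{op} = \bar\sigma_{el} - \sigma_{fl} \qquad \text{and} \qquad \sigma_c^{op} = \bar\sigma_r + \sigma_{fl}.$$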


The validity of this correspondence depends on the equality of fluc- 
tuation with compound elastic scattering. 


The crucial point in our argument [...] More insight into this [...]
behaviour of a neutron wave [...] wave packet comes out late compared
to the rest, thus identifying [...] with the part of the wave function
which was delayed by forming the compound states. Basically the
difference in time behaviour [...] the scattering associated with [...]
from our definitions. [...] had to take up the [...] Shape-elastic
scattering must [...] wave packet in the old single [...] scattering on
the other hand [...] $\Delta\eta$ near a compound resonance [...] behaviour of
the reflected packet.

Explicitly, we write (6) in the form


$$\psi = \frac{1}{r}\left[e^{-ikr} - \langle\eta\rangle\, e^{ikr} - (\eta - \langle\eta\rangle)\, e^{ikr}\right].$$


In this way we put [...] the first term [...] dropping the last term
would give a model in which the scattering [...] and the absorption
[...]. It would leave only the part we have identified with the
optical model.

In conformity with our average over an energy interval I great
compared with the level spacing D, we now build a wave packet [...]
phase waves (13), the incoming part of which passes [...] time
$T \sim \hbar/I$. We can then examine the time behaviour of [...] parts of
the wave packet as they pass a given radius, looking separately at the
shape elastic term $-\langle\eta\rangle e^{ikr}$ and at $-(\eta - \langle\eta\rangle)e^{ikr}$. Since the
average value $\langle\eta\rangle$ is constant over the energy interval of the pulse,
the scattered pulse pertaining to the first of these two terms has the
same shape as the incident pulse and will emerge immediately as the
incident pulse sweeps over the nucleus.


The second term, corresponding to the fluctuation, is the more 
interesting one. Its time dependence is given essentially* by


$$f(t) = \int (\eta - \langle\eta\rangle)\, e^{-iEt/\hbar}\, dE .$$

* The wave packet is observed going in and out reasonably close to the nucleus so that
it has no time to spread during its motion.


On putting in for $\eta$ the approximate Breit-Wigner $\eta_{BW}$ of Eq. (13), we
can evaluate this time behaviour. For t > T, we obtain


$$|f(t)|^2 \propto \sum_{s,s'} \Gamma_n^{(s)}\,\Gamma_n^{(s')}\, e^{i(E_s - E_{s'})t/\hbar}\, e^{-(\Gamma^{(s)} + \Gamma^{(s')})t/2\hbar}. \tag{16}$$


In (16) all the periods $2\hbar/(\Gamma^{(s)} + \Gamma^{(s')})$ are of the order $\hbar/\Gamma^{(s)} = \tau^{(s)}$. Hence
$|f(t)|^2$ decays in a time of the order $\tau^{(s)}$.

We may recall the inequality $\hbar/\Gamma^{(s)} \gg \hbar/D \gg \hbar/I$. (That is, the
lifetimes $\tau^{(s)}$ of the compound system are long compared to the internal
repetition time $\tau$, which, in turn, is much greater than the length T of the
wave packet.) Consequently, $\tau^{(s)} \gg T$. We see therefore that the
emission of the fluctuation wave packet takes place almost entirely after the
time T, that is, almost without any interference with the shape elastic
scattering. As expected, the fluctuation gives elastic scattering which is
delayed and which arrives (according to (16)) with the decay periods of
the compound states. Our interpretation of $\sigma_{fl}$ as $\bar\sigma_{ce}$ is reinforced by
the time behaviour*.


We can now summarize the progress of a nuclear reaction as follows: 
When the incident wave packet reaches the nucleus, a large part of the 
pulse is scattered by the nuclear potential well, and a scattered pulse of 
roughly the same form as the initial one leaves the nucleus at once. At 
the same time, that is, within the coalescence time $\hbar/V_1$ on the optical
model, part of the incident pulse forms a compound nucleus. Because
this compound forming part returns to the channel entrance only after
the time $\tau$, it has no influence upon the original scattered pulse.


* In (16) we may split the sum into $\sum_s (\Gamma_n^{(s)})^2 e^{-\Gamma^{(s)}t/\hbar} + \sum'$, where $\sum'$ contains all
the cross terms ($s' \neq s$) between the various resonances. In addition to the exponential decay,
each cross term contains the oscillatory time factor $\cos[(E_s - E_{s'})t/\hbar]$. In special
circumstances, these terms may interfere at certain times to make a large contribution to $|f(t)|^2$.
On the time average, however, they are small compared to the diagonal (s' = s) sum as
long as $\Gamma \ll D$. Crudely speaking, therefore,

$$|f(t)|^2 \sim \sum_s (\Gamma_n^{(s)})^2\, e^{-\Gamma^{(s)}t/\hbar},$$

i.e. $|f(t)|^2$ shows the decays of the separate compound states.

It is interesting to note that as a function of energy the compound elastic wave packet
stretches over the energy interval I, but it is concentrated, as one would expect, in narrow
energy bands at the energies of the resonances of the compound system. This
concentration is related to the long times $\tau^{(s)}$ during which the compound elastic scattering comes out.

We also remark (still for $\Gamma/D \ll 1$) that in

$$\frac{1}{I}\int |f(t)|^2\, dt \propto \frac{1}{I}\left(\sum_s \frac{(\Gamma_n^{(s)})^2}{\Gamma^{(s)}} + \sum{}'\right)$$


the second term is negligible, so that

$$\frac{1}{I}\int |f(t)|^2\, dt \propto \frac{1}{I}\sum_s \frac{(\Gamma_n^{(s)})^2}{\Gamma^{(s)}},$$


in agreement with (14) for $\bar\sigma_{ce}$.


Here we are using the fact, discussed on page 142, that $\tau = 2\pi\hbar/D$
represents a “time of revolution” in the compound state; it is of the
order of the time it takes for the pulse to “reappear” at the entrance
channel after having produced a compound state. As $\tau \gg T$, the
reappearance at the entrance channel is long after the original scattering has
taken place. Consequently the original scattering of the pulse cannot 
be altered by the establishment of any special boundary conditions, or, 
in other words, the internal pulse which forms the compound system 
cannot interfere with the original pulse to modify the immediate 
reflection. Also the re-emission of the particle by the compound 
nucleus into the entrance channel will take place much later, after the 
time $\hbar/\Gamma$, and can therefore be distinguished from the scattering that
takes place immediately. Hence, for a pulsed initial neutron beam, the 
compound formation can be clearly separated and appears as an 
absorption of part of the incident pulse in spite of the fact that re- 
emission may occur later on. Finally, turning the argument around, 
since the pulse time T must be short compared to the repetition time
$\tau = 2\pi\hbar/D$, the energy spread of the incident beam necessarily must be
large compared to D. The cross sections which are defined by this 
consideration therefore must be averages over the resonances of the 
compound nucleus. 
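The hierarchy of times in this argument follows directly from the three energy scales. The sketch below assumes the definitions used above (lifetime $\hbar/\Gamma$, repetition time $2\pi\hbar/D$, pulse length of order $\hbar/I$); the numerical values of $\Gamma$, D and I are illustrative assumptions and are not taken from any measurement.

```python
import math

HBAR_EV_S = 6.582e-16   # hbar in eV s (rounded)


def time_scales(gamma_ev, spacing_ev, interval_ev):
    """Return lifetime, repetition time and pulse length in seconds."""
    return {
        "lifetime": HBAR_EV_S / gamma_ev,                    # hbar / Gamma
        "repetition": 2.0 * math.pi * HBAR_EV_S / spacing_ev,  # 2 pi hbar / D
        "pulse": HBAR_EV_S / interval_ev,                    # T ~ hbar / I
    }


if __name__ == "__main__":
    # Illustrative values only: Gamma = 0.1 eV, D = 10 eV, I = 1 keV.
    scales = time_scales(gamma_ev=0.1, spacing_ev=10.0, interval_ev=1000.0)
    for name, value in scales.items():
        print(f"{name}: {value:.2e} s")
```

Whenever $\Gamma \ll D \ll I$ the pulse is over long before the compound system has completed one "revolution", and the re-emission arrives later still.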


7. The predictions of the complex potential model compare fairly well 
with the experimental results. We shall classify the experiments into 
the following three groups: total cross-sections, elastic scattering data, 
and cross-sections for the formation of the compound nucleus. 


Total cross-sections. As indicated before, the prediction of the 
total neutron cross sections shows a surprising agreement with the 
measurements, even when a simple square well is used for the potential. 
This agreement is especially significant in the low energy region, 
between 0 and 2 Mev, in which the total cross section, when averaged 
over individual resonances, exhibits pronounced maxima and minima. 
These characteristic features are very well represented by the calcula- 
tions when the constants (5) listed above are used. The calculation is, 
moreover, reasonably sensitive to changes in $V_0$, $V_1$, and R, so that the
fit determines their values rather precisely [25]. 


At somewhat higher energies the total cross sections become less 
sensitive to the constants. They are roughly approximated [8] by the 
formula $\sigma_{tot} = 2\pi(R + \lambdabar)^2$ and have very little dependence on the
imaginary part of the potential. Nevertheless, the observed deviations 




from this form might be used to get some information regarding the 
constants of the potential well. At present the main obstacle to a 
determination of V, above 4 Mev is of a mathematical nature. It is 
difficult to calculate the scattering by a potential well whose dimensions 
are large compared to the wave length, and whose depth is too big for 
the use of the Born approximation. 
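The rough high-energy formula for the total cross section is easy to evaluate. The sketch below assumes $\sigma_{tot} \approx 2\pi(R + \lambdabar)^2$ with $R = 1.45\,A^{1/3}$ fm and the non-relativistic reduced wavelength $\lambdabar = \hbar/\sqrt{2mE}$; the choice of lead (A = 208) at 14 Mev is only an illustration, and the function names are hypothetical.

```python
import math

HBAR_C = 197.327        # hbar * c in MeV fm (rounded)
NEUTRON_MC2 = 939.565   # neutron rest energy in MeV (rounded)


def reduced_wavelength_fm(energy_mev):
    """Non-relativistic reduced de Broglie wavelength hbar / sqrt(2 m E) in fm."""
    return HBAR_C / math.sqrt(2.0 * NEUTRON_MC2 * energy_mev)


def sigma_total_barn(mass_number, energy_mev, r0_fm=1.45):
    """Rough total cross section 2 pi (R + lambda-bar)^2 in barns (1 barn = 100 fm^2)."""
    radius = r0_fm * mass_number ** (1.0 / 3.0)
    return 2.0 * math.pi * (radius + reduced_wavelength_fm(energy_mev)) ** 2 / 100.0


if __name__ == "__main__":
    print(f"lambda-bar at 14 MeV: {reduced_wavelength_fm(14.0):.3f} fm")
    print(f"sigma_tot for A = 208 at 14 MeV: {sigma_total_barn(208, 14.0):.2f} barn")
```

Since only R and the wavelength enter, this estimate carries no information about the imaginary part of the potential, which is the point made in the text.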

For scattering at much higher energies the first application of the 
optical model was done by FERNBACH, SERBER, and TAYLOR [24]. This 
important work was actually the first systematic attempt to represent 
the effect of the nucleus by a complex potential. Recently T. B. 
TAYLOR [27] has extended and refined the method and has applied it 
to the scattering from 40 Mev up. He finds that the experimental 
results near 40 Mev can be reproduced by a potential well which is 
slightly less deep than (5); also the depth of the well must decrease 
further with increasing energy. The imaginary part of the potential is 
considerably larger than the one found at low energy, but it also 
decreases towards still higher energies (see Fig. 1, p. 159). 

Elastic scattering. The scattering cross section $\sigma_{el}^{op}$ and the absorption
cross section, $\sigma_c^{op}$, calculated on the basis of the complex potential
model are somewhat more difficult to compare with the experimental
data. The scattering on the optical model does not include the
compound elastic scattering which plays an important role at energies
below 1 or 2 Mev. Similarly, the calculated absorption represents the
formation of the compound nucleus, and it too has not been measured 
directly, since it contains all possible reactions including the compound 
elastic scattering. 

By estimating the compound elastic scattering, however, some con- 
clusions can be drawn from presently available experiments. With a 
plausible estimate of $\Gamma_n/\Gamma$ (i.e. of $\sigma_{ce}/\sigma_c$), the same simple complex
potential which represents the total cross section between 0 and 3 Mev 
also reproduces the observed angular dependence of the scattering at 
1 Mev [28], [25]. 

At higher energies, where the compound elastic scattering is sup- 
pressed by competing processes, the observed scattering can be com- 
pared directly with the model. Recent measurements of the angular 
dependence of the elastic scattering of neutrons at 4.1 Mev by WALT
and BEYSTER [29] can be represented fairly well by a complex potential
well with the same values of $V_0$ and R used before. The imaginary part,
$V_1$, must be increased by at least a factor 3, which indicates an increased
probability of compound nucleus formation at this increased energy. 


If the neutron energy is raised still higher, the general pattern of the 
angular distribution no longer depends sensitively on the details of the 




potential well other than the radius. The dependence on radius is 
represented by the well-known diffraction pattern of a circular disc. 
Only a very refined study would provide any further information about 
the potential well. 


So far such studies have only been made with protons. For several 
elements scattering of 10 Mev protons [30] has been measured, and the 
results for oxygen have been analysed on the optical model. They are 
well represented by $V_0 = 30$ Mev and $V_1 = 5$ Mev [31]. Also the
scattering of protons of approximately 20 Mev has been measured [32];
and SAXON and WOODS [33] have attempted to explain the measurements
in terms of the optical model. They are successful in reproducing the
main features of the experiment by increasing the imaginary part to
10 Mev and by rounding off the edges of the square well. The region of
rounding in which the potential rises from the value $-V_0$ to zero has to
be made about $1 \times 10^{-13}$ cm wide. This rounding off is a very natural
change, and it should be expected to be of importance at higher energy 
and for scattering at large angles. 


Compound nucleus formation. The difficulties of comparing the 
theoretical predictions of the cross section $\sigma_c$ for the formation of the
compound nucleus with experimental results can be overcome in the
following way. At low energies the predicted values of $\sigma_c$ are supposed
to be average values over resonances; consequently, we can make use
of the fact that the resonances are well represented by the Breit-Wigner
formula. We then obtain the following relation for $\sigma_c$ in the case of
very low energies (neutrons of l = 0 only):


$$\sigma_c = 2\pi^2 \lambdabar^2\, \frac{\bar\Gamma_n}{D} = \frac{C(A)}{v},$$


where $\bar\Gamma_n/D$ is the average of the neutron width over the level distance
taken for neighbouring levels, and C(A) is a function of the atomic
number. This expression allows a direct check of the predictions for
$\sigma_c$ at low energies.

It is easy to see that our model predicts maxima for $v\sigma_c = C(A)$ at
values for which a standing wave can develop within the nucleus. This
is the case when $\sqrt{2mV_0}\,R = \pi\hbar(n + \tfrac{1}{2})$, where n is an integer. Hence,
within the range of nuclear radii available, we should expect a maximum
for $v\sigma_c$ and also for $\bar\Gamma_n/D$ near the mass numbers A ~ 11, 55, and 155.
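The positions of these maxima can be estimated from the standing-wave condition. The sketch below assumes the condition $\sqrt{2mV_0}\,R = \pi\hbar(n + \tfrac{1}{2})$ at zero incident energy, a square well of depth $V_0 = 40$ Mev and $R = 1.45\,A^{1/3}$ fm; the function name is hypothetical, and the exact mass numbers shift with the constants chosen.

```python
import math

HBAR_C = 197.327        # hbar * c in MeV fm (rounded)
NEUTRON_MC2 = 939.565   # neutron rest energy in MeV (rounded)


def resonant_mass_number(n, v0_mev=40.0, r0_fm=1.45):
    """Mass number A at which sqrt(2 m V0) R = pi hbar (n + 1/2) for R = r0 * A^(1/3)."""
    k_inside = math.sqrt(2.0 * NEUTRON_MC2 * v0_mev) / HBAR_C   # internal wave number in fm^-1
    radius = math.pi * (n + 0.5) / k_inside
    return (radius / r0_fm) ** 3


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"n = {n}: A ~ {resonant_mass_number(n):.0f}")
```

With these constants the three maxima fall within about ten per cent of the mass numbers quoted above; a slightly deeper well moves them to lower A.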


The maximum at A ~ 155 has been established by CARTER, HARVEY, 
HUGHES and PILCHER [34]. It is not quite as high as the optical
model would predict with the constants established by the scattering 




experiments. However, fluctuations in the A-dependence of the radii and 
strong deviations from sphericity might cause a flattening of the ex- 
pected maximum. At A ~ 55 and 11 the level spacing is large, and it is 
more difficult to measure the constants of several levels in order to 
obtain a valid average. Quite recently, however, a maximum at A ~ 55 
was found by COTE and BOLLINGER [35].

The predictions of the optical model in respect to $\sigma_c$ can also be
checked by comparing them with the observed reaction neutron cross
sections, $\sigma_r$. The cross section $\sigma_c$ must be larger than $\sigma_r$, and the
difference must be accounted for by compound elastic scattering. The
quantitative agreement with the results derived from the square well 
potential is somewhat less satisfactory than the agreement of the total 
neutron cross sections. 

At 1 Mev the experimental values exhibit pronounced maxima and
minima as a function of A [28]. The theory also shows maxima and 
minima, but with the values of R used above they occur at wrong values 
of A for the heavy elements [25]. By using smaller radii for these 
elements [36], this discrepancy can be removed at least to within 
present experimental accuracy*. (The necessary modification of R does 
not shift the peak in $v\sigma_c$ too far and even improves the fit to $\sigma_{tot}$.) At
higher energy the maxima and minima are less pronounced both
experimentally and theoretically; however, the experimental cross 
sections are somewhat smaller in general than the theoretical prediction 
given by the square well potential [29]. Rounding the edge of the 
potential well will increase the theoretical values. 


This application of the complex potential model to the elastic 
scattering and to the compound nucleus formation is in some respects 
a more incisive test than the agreement of the calculations with total 
cross sections. Here the agreement depends on the detail of transferring 
the fluctuation from one cross section to another; whereas, the total 
cross section, depending directly on $\langle\eta\rangle$, is completely independent of
this split of absorption and scattering. As the test here is somewhat 
more sensitive than that provided by total cross sections, it is not too 
surprising that the agreement is somewhat less good. 


In spite of these shortcomings, the values of the reaction cross sections 
also indicate an appreciable change of absorption with increasing 
energy. At 1 Mev the reaction cross sections fall within the range of 
values predicted if we use the absorption deduced from the total cross 


* It may be noted that the comparison of the theory and the experiments concerning 
the reaction cross section in this energy range severely test them both. The reaction 
cross section is measured as a difference between the total cross section and the integrated 
differential elastic cross section. This difference is only ~ 4 of the total. 


sections. At 14 Mev [37] they are all very close to the geometrical
maximum value $\pi(R + \lambdabar)^2$. This is an indication of weak absorption
at 1 Mev and of strong absorption at 14 Mev.

We have assembled in Fig. 1 the values of the coalescence coefficient
(4) needed in the various energy regions to represent the compound 


[Figure 1: the coalescence coefficient K plotted against incident energy; energy axis marked at 1, 2, 4, 10, 20, 40, 100 and 200 Mev.]


Fig. 1. The coalescence coefficient K in nuclear matter as a function of the 
incident energy. The information is collected from the following sources: 


At 1 Mev, FESHBACH, PORTER, and WEISSKOPF, Phys. Rev. 96, 448 (1954).

At 4 Mev, WALT and BEYSTER, to be published.

At 10 Mev, PROWSE and HOSSAIN, private communication. (Unfortunately, the
uncertainty of these measurements is not known to us. We have indicated that
some uncertainty undoubtedly exists based on general experience with fitting
data of similar type.)

At 14 Mev, GITTINGS, BARSCHALL and EVERHART, Phys. Rev. 75, 610 (1949);
PHILLIPS, DAVIS and GRAVES, Phys. Rev. 88, 600 (1952). (A lower limit for K is
all that is established.)

At 20 Mev, SAXON and WOODS, Phys. Rev. 95, 577 (1954).

Above 40 Mev, the curve is taken from T. B. TAYLOR, Phys. Rev. 92, 831 (1953).


nucleus formation. The curve connects the few known values of K and 
is intended to give a qualitative orientation only. A more accurate 
representation would not now be meaningful, for the values inferred 
from experiment are good only to within a factor of two at best, and 
they also depend on the assumptions made as to the shape of the 
potential well. The most characteristic feature of K is the strong rise 
within the first ten Mev. This rise very probably reflects the rapid 
increase in the number of possible ways for the incident particle to 
excite the nucleus. The drop at higher energies probably can be 
ascribed to the decrease of the elementary nucleon-nucleon interaction
cross section with increasing relative energy. 


8. In this essay we have tried to show that a more careful application of
BOHR’s original ideas and an analysis of the vastly increased
experimental material leads to a modification of the primitive picture
previously used to describe nuclear reactions: According to the old pic- 
ture, the incident particle hits the target nucleus and forms a compound 
system in which its energy is shared among all constituents. The 
compound system decays into the reaction products in a way which is 
independent of the process of formation. 


Now this view must be modified in several ways. The formation of 
a compound system does not necessarily occur whenever the incident 
particle penetrates the nucleus. In fact, the effect of the target nucleus 
upon the particle can be described reasonably well (when averaged over 
resonances) by a complex potential. Part of the effect is simply a 
scattering in which the target nucleus acts as a potential well only. The 
other part is the compound nucleus formation which occurs with a 
much smaller probability than previously anticipated. Here we under- 
stand by compound formation any process in which the incident 
particle is removed from the entrance channel. It includes not only the 
processes in which the incident nucleon shares its energy with the 
whole nucleus, but also energy transfers to one or to a few constituents 
of the target nucleus. 


The decay of the compound system into the reaction products 
depends upon the detailed mechanism of the energy transfer. The 
decay is independent of the mode of formation only in certain limiting 
cases. In general, the reaction products and their energetic and angular 
distribution will depend on the special conditions that prevail when the 
compound system is formed. In order to understand, classify, and cal- 
culate the different mechanisms that come into play in nuclear reactions, 
a detailed study will be necessary of the interactions between the inci- 
dent particle and the individual and collective motions of the nuclear 
constituents. 


BOHR’s compound nucleus picture provided insight into many 
phenomena; it demonstrated its “peculiar facilities for a comprehen- 
sive interpretation of the characteristic properties of nuclei in allowing 
a division of nuclear reactions into well-separated stages to an extent 
which has no simple parallel in the mechanical behaviour of atoms.” [1]. 
Now after two decades our knowledge extends beyond the limits of this 
description. 


REFERENCES 


[1] N. BOHR; Nature 137, 344, 1936 
[2] FERMI et al.; Ric. Sci. 5, 282, 1934 
[3] BETHE; Phys. Rev. 47, 747, 1935 
    PERRIN and ELSASSER; J. de Phys. 6, 195, 1935 
    FERMI et al.; Proc. Roy. Soc. A 149, 522, 1935 
    BECK and HORSLEY; Phys. Rev. 47, 510, 1935 
[4] BJERGE and WESTCOTT; Proc. Roy. Soc. A 150, 709, 1935 
    MOON and TILMAN; Nature 135, 904, 1935 
    SZILARD; Nature 136, 849 and 950, 1935 
    FERMI and AMALDI; Ric. Sci. A 6, 544, 1935 
    FRISCH et al.; Nature 137, 149, 1936 
    FRISCH and PLACZEK; Nature 137, 357, 1936 
    PREISWERK and HALBAN; Nature 138, 163, 1936 
[5] BREIT and WIGNER; Phys. Rev. 49, 519, 1936 
[6] BETHE and PLACZEK; Phys. Rev. 51, 450, 1937 
[7] EISENBUD and WIGNER; Proc. Nat. Acad. (U.S.A.) 27, 281, 1941 
    WIGNER and EISENBUD; Phys. Rev. 72, 29, 1947 
    TEICHMAN; Phys. Rev. 77, 506, 1951 
[8] FESHBACH and WEISSKOPF; Phys. Rev. 76, 1550, 1949 
[9] FRENKEL; Sov. Phys. 9, 533, 1936 
    WEISSKOPF; Phys. Rev. 52, 295, 1937 
    LANDAU; Phys. Zeits. Sow. 11, 556, 1937 
[10] BLASER et al.; Helv. Phys. Acta 24, 3, 1951 
[11] GUGELOT; Phys. Rev. 81, 51, 1951 
    GRAVES and ROSEN; Phys. Rev. 89, 343, 1953 
[12] BARSCHALL; Phys. Rev. 86, 431, 1952 
    MILLER et al.; Phys. Rev. 88, 83, 1952 
    WALT et al.; Phys. Rev. 89, 1271, 1953 
    OKAZAKI, DARDEN and WALTON; Phys. Rev. 93, 461, 1954 
    NERESON and DARDEN; Phys. Rev. 89, 775, 1953 and Phys. Rev. 94, 1678, 1954 
    COOK and BONNER; Phys. Rev. 94, 651, 1954 
[13] GUGELOT; Phys. Rev. 93, 425, 1954 
    PAUL and CLARK; Can. J. Phys. 31, 267, 1953 
    EISBERG and IGO; Phys. Rev. 93, 1039, 1954 
    McMANUS and SHARP; Phys. Rev. 87, 188, 1952 
[14] BLATT and WEISSKOPF; Theoretical Nuclear Physics, Ch. IX, § 5 
[15] GHOSHAL; Phys. Rev. 80, 939, 1950 
[16] B. COHEN; Phys. Rev. 92, 1245, 1953; COHEN and NEWMAN, to be published 
[17] BOHR, PEIERLS and PLACZEK; Nature 144, 200, 1939 
[18] BERNARDINI, BOOTH and LINDENBAUM; Phys. Rev. 88, 1017, 1952 
    GOLDBERGER; Phys. Rev. 74, 1269, 1948 
[19] BETHE; Phys. Rev. 53, 675, 1938 
[20] WEISSKOPF; Helv. Phys. Acta 23, 187, 1950 
[21] DRELL and HUANG; Phys. Rev. 91, 1527, 1953 
    BRUECKNER, LEVINSON and MAHMOUD; Phys. Rev. 95, 217, 1954 
[22] JOHNSON and TELLER; Phys. Rev. 93, 357, 1954 
[23] WEISSKOPF; Science 113, 101, 1951 
[24] BETHE; Phys. Rev. 57, 1125, 1940 
    FERNBACH, SERBER and TAYLOR; Phys. Rev. 75, 1352, 1949 
[25] FESHBACH, PORTER and WEISSKOPF; Phys. Rev. 96, 448, 1954 
    R. K. ADAIR; Phys. Rev. 94, 737, 1953 
    FESHBACH, PORTER and WEISSKOPF; Phys. Rev. 90, 166, 1953 
[26] BLATT and WEISSKOPF; Theoretical Nuclear Physics 
[27] T. B. TAYLOR; Phys. Rev. 92, 831, 1953 
[28] WALT and BARSCHALL; Phys. Rev. 93, 1062, 1954 
[29] WALT and BEYSTER; to be published in Phys. Rev. 
[30] BURCHAM, GIBSON, HOSSAIN and ROTBLAT; Phys. Rev. 92, 1266, 1953 
[31] PROWSE and HOSSAIN (Bristol); private communication 
[32] BURKIG and WRIGHT; Phys. Rev. 82, 451, 1951 
    COHEN and NEIDIGH; Phys. Rev. 93, 282, 1954 
    GUGELOT; Phys. Rev. 93, 425, 1954 
[33] SAXON and WOOD; Phys. Rev. 95, 577, 1954 
[34] CARTER et al.; Phys. Rev. 96, 113, 1954 
[35] COTE and BOLLINGER; Bull. Amer. Phys. Soc. 30, (No. 1), 23, 1955 
[36] W. S. EMMERICH; personal communication 
[37] GITTINGS, BARSCHALL and EVERHART; Phys. Rev. 75, 610, 1949 
    PHILLIPS, DAVIS and GRAVES; Phys. Rev. 88, 600, 1952 


NUCLEAR FISSION AND 
NUCLEAR STABILITY 


John Archibald Wheeler 


FISSION, with all its revolutionary consequences for the world, is only 
one of several elementary mechanisms of nuclear transformation. Yet 
compared to neutron, proton, alpha, beta and gamma processes, 
nuclear division calls forth a uniquely drastic kind of nucleonic 
rearrangement. The unusual character of this process has given it a 
special part in the development of nuclear theory. 

SANTAYANA speaks in Persons and Places of a life of searching into 
strange worlds, unexpected customs, marvellous ideas and new ways 
of thought. But he speaks of no Copenhagen where the web of mystery 
was unspun; nor of one who endlessly sought to bring the strange and 
unusual into harmony with the clear and well established. In the 
struggle for harmony in nuclear physics fission had a special place. 


The effect discovered by HAHN and STRASSMANN in 1938-39 fell 
harmoniously into the pattern of nuclear reactions envisaged in BOHR’s 
1936 compound nucleus picture and BOHR and KALCKAR’s 1937 
liquid drop model. The deformation that leads to fission is a natural 
extension of the lowest capillary oscillation of the droplet. The com- 
petition demanded by the compound nucleus concept between neutron 
emission, radiation and fission as mechanisms for disposal of the excess 
energy appears clearly in the curve for cross section for neutron 
induced fission as a function of energy. The existing scheme of ideas 
gave a reasonable account of many of the main features of nuclear 
fission [1], and more dramatically, forecast [2] the vital difference 
between U-235 and U-238 months before that distinction could be 
confirmed by experiment. 

The observation of the unusual thus gave support to a nuclear 
model and a picture of nuclear reactions, originally developed to 
account for quite different kinds of experimental evidence. Thus applied 
at an early stage to fission, the statistical account of nuclear reactions 
has since received a far reaching development, due not least to WEISSKOPF 
and his associates. This account is extraordinarily simple. The 
probabilities per second for the various modes of disposal of the energy 
of the compound nucleus are idealized as depending upon excitation 
and number of nucleons in a reasonably uniform manner. These 
probabilities allow one to predict as a function of energy cross sections 
for a variety of reactions of the type A + (original nucleus) → B + 
(final nucleus), in broad general agreement with most of a great mass 
of experimental evidence. 


This fruitful and fast moving method of “correlation in outline” 
characteristically left unexplained many important observational 
details. Looking back, one sees in nucleonics as in so many other parts 
of physics how much progress can be made on the basis of simple 
principles and points of view which themselves lack at the moment any 
possibility of deeper justification, and which eventually even have to 
be reassessed as rather drastic simplifications. One now recognizes 
that the primitive liquid drop model and the quite different independent 
particle picture of nuclear constitution each fail to account for impor- 
tant aspects of the evidence. Instead, there exists a broader picture, 
the collective model or unified model [3], which recognizes both indi- 
vidual particle and droplet aspects of the motion of a large collection 
of nucleons, and in the evolution of which BOHR has had a guiding 
role. 


The present account briefly reviews the facts and considerations that 
have led to the unified model. It then touches on fission as envisaged 
in this collective model. It goes on to illustrate some of the relevant 
ideas by analyzing the thousand-fold difference in the spontaneous 
fission rate of nuclei of odd and even mass numbers. Finally it estimates 
the limits placed by spontaneous fission upon the size of the heaviest 
nuclei, concluding that nuclei should be subject to experimental 
study which have masses two or more times greater than the mass of 
uranium. May this analysis expiate a failure of the past, when 
theoretical physics did not take existing models sufficiently seriously 
nor follow out their consequences to the limit, thus delaying for two 
years the discovery of nuclear fission! 


The unified nuclear model 


As statistical predictions were confronted with experiment, an increasing 
body of evidence revealed characteristic effects at variance with the 
most primitive forms of the compound nucleus picture or the liquid 
drop model or both. All of these effects point to a more or less free 
motion of individual nucleons through the nuclear interior: (1) 
regularities in ground state spins and magnetic moments; (2) regu- 
larities in the energy level patterns of light nuclei; (3) regularities in the 
dependence of neutron scattering cross sections upon energy and mass 
number, and an account of these regularities in terms of a simple 
potential for the neutron with a complex absorptive term; (4) similar 
features in the cross section for elastic scattering of protons; (5) some 
correlation in direction between bombarding neutron and separating 
fission fragments, qualitatively consistent with the picture of a nucleon 
initiating a deformation leading towards fission before it has been 
assimilated into a compound system; (6) under bombardment of 
medium and heavy nuclei by 15 Mev to 20 Mev protons, a division of 
the spectrum of outgoing particles into two parts of which one is 
describable approximately as a Boltzmann-like evaporation from a 
compound nucleus, whereas the other represents particles that have 
undergone only a small energy loss. Some of these effects suggest a 
division among bombarding nucleons between those that are assimilated 
into a state of the compound nucleus, and those that merely respond 
to the average potential of the nucleus. 


The more or less free motion of nucleons through the nuclear 
interior implies an origin for collective modes of motion quite different 
from that envisaged by the liquid drop model in its most primitive form. 
There the nucleons were idealized as having negligible power of penetration 
through nuclear matter in analogy to H₂O molecules stopped 
in the first molecular layer of water. It was reasonable in that picture 
that the particles should partake of a common motion, such as a collec- 
tive vibration. But such modes of motion also come into evidence in 
the opposite extreme of free motion of the nucleons through the 
nuclear interior. Any given nucleon responds to the configuration of 
the other nucleons—in this extreme idealization—only when it arrives 
at the nuclear surface, or the region where the potential rises rapidly. 
There is a kind of momentary self-perpetuation in displacements of the 
local surface from its time average position. New particles coming up 
are turned back at nearly the same place. In so far as they in turn have 
an effect on the potential they move the boundary to a location not 
very different from its past position. Otherwise stated, the period of 
motion of the individual nucleons is short in comparison to the period 
of the collective vibrations defined by the nuclear surface. 


In the unified or collective picture just described, the general 
excitation of the system is a combination of individual nucleon motion 
plus collective vibration and rotation. The nuclear oscillation, like a 
molecular vibration, is characterized by a curve of potential energy as a 
function of deformation, this curve differing from one nucleonic state 
of excitation to another. Only in the statistical average will this potential 
agree approximately with the predictions of the liquid drop model. In 
detail there will be characteristic effects associated with individual 
nucleon orbits. Such effects appear for instance in the case of spontaneous 
fission. 


Odd-even differences in spontaneous fission 


The spontaneous fission rates of nuclei of odd mass number are 
observed to be slower than the rates for the corresponding even nuclei 
by a large factor [4]. The value of this forbiddenness factor in most cases 
lies between 10° and 10°. This circumstance indicates that odd nuclei 
have a higher barrier against fission than corresponding even nuclei, or 
require a greater change of shape to pass through the barrier, or both. 
One will expect a correlation between barrier height and width. Conse- 
quently the issue may be phrased: Why odd-even differences in barrier 
height? No alterations would be expected from the liquid drop 
idealization. Instead, the differences are reasonably attributable to 
specificities in nuclear binding. 


The order of magnitude of the observed odd-even disparities is such 
as would be explained by barriers perhaps 0.5 Mev higher for the odd 


Fig. 1. For five nuclei one knows both the spontaneous fission half life 
(O. CHAMBERLAIN, G. FARWELL, J. JUNGERMAN, E. SEGRE and C. WIEGAND, 
summarized by E. SEGRE in Phys. Rev. 86, 21, 1952), and the threshold for 
photofission (H. W. KOCH, J. McELHINNEY and E. L. GASTEIGER, Phys. 
Rev. 77, 329, 1950). The observations are consistent with, but do not 
prove, an increase of the height of the fission barrier with increasing 
life time: $\Delta E/\Delta \log_{10} T_{1/2} = -0.02$ to $+0.23$ Mev. 
(Abscissa of the figure: photofission threshold, Mev, from 5.0 to 5.6.) 


nuclei than for the corresponding even nuclei [5]. This estimate can be 
justified by noting that a photofission threshold of 5.15 Mev (Fig. 1) 
goes with a half life against spontaneous fission of the order of $10^{15.8}$ 
years, or a barrier penetration probability of the order of 


$$10^{-21}\ \text{sec} / 10^{15.8}\ \text{years} = 10^{-44}.$$ 


Assuming that the barrier penetration exponent is roughly proportional 
to the barrier height, one will expect a change in life by a factor $10^{3}$, the 
order of the odd-even difference, for an alteration in barrier height of the 
general magnitude 


$$\Delta E \sim (3/44) \times 5.15\ \text{Mev} = 0.35\ \text{Mev}.$$ 


This result is uncertain by a factor perhaps a little less than 2. Independently, 
the observations compiled in Fig. 1 can be interpreted to say 
that a change by a factor $10^{3}$ in barrier penetration probability goes 
with a change in barrier height, $\Delta E$, between $-0.07$ and $+0.70$ Mev. 
An increase in the number and precision of fission threshold measure- 
ments would make a great improvement in these estimates. 
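
The two estimates of the odd-even barrier difference can be retraced numerically. The following is only a sketch: the function name is hypothetical, and the inputs are the figures quoted in the text (a 5.15 Mev threshold, a penetration probability of the order of 10^-44, a lifetime factor of 10^3, and the rounded slope range given under Fig. 1).

```python
def barrier_shift(threshold_mev, log10_penetration, log10_life_factor):
    """Change in barrier height (Mev) that alters the lifetime by the given
    number of decades, assuming the penetration exponent is proportional
    to the barrier height (the assumption made in the text)."""
    return threshold_mev * log10_life_factor / abs(log10_penetration)

# Estimate from proportionality: (3/44) x 5.15 Mev
delta_e = barrier_shift(5.15, -44.0, 3.0)

# Independent estimate: rounded slope range of Fig. 1 (Mev per decade of
# half life) multiplied by three decades
slope_range = (-0.02, 0.23)
fig1_range = tuple(3.0 * s for s in slope_range)

print(f"Delta E from proportionality: {delta_e:.2f} Mev")
print(f"Delta E from Fig. 1: {fig1_range[0]:+.2f} to {fig1_range[1]:+.2f} Mev")
```

With the rounded slopes the second estimate comes out at about -0.06 to +0.69 Mev, close to the range quoted in the text, which presumably rests on unrounded slopes.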


In a private discussion AAGE BOHR suggested that the lower barrier 
for even nuclei might reasonably correlate with the circumstance that 
heavy even-even nuclei always have zero spin. It is possible to develop 
and test this view. It will be recalled that the spectrum of low lying 
levels of heavy nuclei correlates beautifully with the picture developed 
by BOHR [6], and BOHR and MOTTELSON [7], for nuclei with axial 
symmetry undergoing collective rotation. Accordingly it is appropriate 
to assume that one can speak in a well defined way about the intrinsic 
angular momentum, $\Omega\hbar$, of the nucleonic system about the symmetry 
axis. For any given value of $\Omega$ let $V_\Omega(\alpha)$ represent the intrinsic, or 
nucleonic, energy content of a nucleus which has a given deformation, 
$\alpha$, but possesses no kinetic energy of collective motion, either vibrational 
or rotational. It is well known in the case of molecules how to find 
this deformation potential from the spacings of the vibrational and 
rotational levels [8], and we shall assume that the same type of analysis 
is in principle possible for heavy nuclei of not too great excitation. In 
other words, the distortion energy is thought of as defined by the Klein- 
Rydberg type of analysis, not by way of idealized experimental arrange- 
ments designed to stop or allow for the collective motion of the nucleus. 


For even nuclei, as AAGE BOHR emphasizes, the nucleonic state 
$\Omega = 0$ lies lowest at all deformations, whereas for odd nuclei the intrinsic 
angular momentum, $\Omega\hbar$, of the state of lowest energy will change from 
one value of $\alpha$ to another. Accordingly it is appropriate to define for 
odd nuclei with axial symmetry a specialization energy, $G_\Omega(\alpha)$ (Fig. 2). 
For any given distortion, $\alpha$, and any given intrinsic angular momentum, 
$\Omega\hbar$, this quantity represents the difference between the deformation 
potential, $V_\Omega(\alpha)$, for the given $\Omega$ value and the potential curve, $V_{gs}(\alpha)$, 
that lies lowest at the given deformation. 

The qualitative properties of the specialization energy show up 
already in the extreme individual particle picture, the idealization in 
which nucleons are regarded as running freely through the nuclear 
interior and as constrained only by the sudden rise in potential near the 


Fig. 2. Specialization energy as a function of nuclear deformation for a 
nucleus with 95 protons. The specialization energy represents the excess 
of the energy of a nucleus with a given spin, parity and deformation over 
the energy of the same nucleus for the same deformation in the spin-parity 
state of lowest energy. This quantity may be considered to measure the 
amount by which the fission barrier for an odd nucleus exceeds that for a 
neighbouring even-even nucleus. The curves are calculated from the primitive 
single particle picture described in the text and have illustrative value 
only. (Abscissa of the figure: deformation parameter, $\delta$, from 0 to 0.3.) 


surface. In the most elementary approximation the nucleonic energy 
of the system is regarded [2] as the sum of five terms: a main binding 
term; a correction for incomplete binding at the surface proportional 
to the extension $S(\alpha)$ of this surface; an electrostatic potential energy, 
$v(\alpha)$; the sum of the proper energies, $T_i(\alpha)$, of nucleons bound within 
a configuration of the given form; and the energy of pairing of all 
those nucleons, $2P$ in number, each of which is paired off with another 
nucleon in an orbit identical except for the sign of the $\Omega$ value: 


$$V_\Omega(\alpha) = -BA + O\,S(\alpha) + v(\alpha) + \sum_i T_i(\alpha) - P\Delta.$$ 


Fig. 3 illustrates the qualitative dependence of individual nucleon 
energy, $E_i$, upon deformation as determined by NILSSON [9]. He 
assumed a harmonic oscillator potential with unequal force constants 
along and perpendicular to the symmetry axis and took a particularly 
simple model for the spin orbit coupling. It will not be necessary to 
ask to what extent the order of levels may be changed in a different 
model, for the general features that interest us will not be significantly 
changed by these details. In an even-even nucleus with a deformation 
$\alpha$, the lowest N/2 neutron states will be filled, and likewise the lowest 


Fig. 3. Section from NILSSON’s diagram for single particle energies as a 
function of deformation; used in constructing the curves for specialization 
energy shown in Fig. 2. Heavy markings indicate the point to which single 
particle levels are filled by 94 protons. 


Z/2 proton states. With an alteration in $\alpha$ there will occur a change in 
the value of $\Omega$ for the top proton level, $\Omega_{\text{top}}$. By reason of their coupling 
with each other the two protons will readjust their orbits to the new 
values, $\Omega_{\text{top}}$ and $-\Omega_{\text{top}}$, thus minimizing the energy of the system and 
keeping constant the intrinsic angular momentum 


$$\Omega = \sum_i \Omega_i = 0$$ 


of the system. The situation will be quite the same when the nucleus 
contains Z + 2 protons; then the last two protons go into states 
$\pm\Omega_{\text{top}+1}$. In contrast, in the odd-even nucleus (Z + 1, N), the lowest 
level of the system for a given deformation will have the total intrinsic 
angular momentum $\Omega_{\text{top}+1}$, which differs from deformation to deformation. 
To achieve instead a prescribed intrinsic angular momentum 
$\Omega$, the system will have to be excited. The excitation can normally 
be described in one of the following two ways, the choice between them 
being made on the basis of which makes the least requirement of energy: 

(a) the odd proton is raised from the state $\Omega_{\text{top}+1}$ to the next higher 
single particle state of angular momentum $\Omega$ about the symmetry axis; 
or (b), out of a slightly lower lying and already filled state of quantum 
number $\Omega$, that proton is withdrawn which has quantum number $-\Omega$, 
and it is paired off with the (Z + 1)st proton in the state $\Omega_{\text{top}+1}$. In 
either case the energy of specialization is given by the formula 


$$G_\Omega(\alpha) = |\,T(\Omega,\alpha) - T(\Omega_{\text{top}+1},\alpha)\,|.$$ 


In the one case the quantity within the absolute sign is positive, while 
in the hole case it is negative. This formula, together with NILSSON’s 
curves of Fig. 3, was used to construct the purely illustrative curves for 
the specialization energy shown in Fig. 2. The number of proton pairs 
is of course the same in the cases (a) and (b): P = Z/2. Consequently 
the difference of energy between the odd nucleus and the average of the 
two neighbouring even nuclei, according to the present primitive 
analysis, will be 


$$V(\alpha; Z+1, N, \Omega) - \tfrac{1}{2}[V(\alpha; Z, N, 0) + V(\alpha; Z+2, N, 0)] = G_\Omega(\alpha) + \tfrac{1}{2}\Delta.$$ 


This quantity has to be added to the deformation potential of typical 
even-even nuclei in order to obtain the deformation potential of the odd 
nucleus of the given intrinsic angular momentum. What is relevant for 
fission is of course not the absolute deformation potential, but the 
changes in this quantity as $\alpha$ increases. The pairing energy, $\Delta$, will be 
expected to vary approximately inversely with the nuclear volume but 
to be nearly independent of nuclear shape. Subtracting this irrelevant 
constant, we conclude that the deformation potential for an odd 
nucleus is 


$$V_{\text{odd}}(\alpha) = V_{\text{even}}(\alpha) + G_\Omega(\alpha).$$ 


In this expression for the shape of the fission barrier the specialization 
energy $G_\Omega(\alpha)$ has of course been derived from a rather primitive 
model. However, a quantity with the same significance will be expected 
also in a deeper analysis of the nuclear energy level pattern as it is 
affected by deformation. In particular, anomalously large values of 
$G_\Omega(\alpha)$ will occur near magic numbers when at the same time the 
deformation is not too large and $\Omega$ takes on certain values. Moreover, 
the predictions about $G_\Omega(\alpha)$ from the primitive model, however wrong 
in detail, will be expected to be of the correct order of magnitude. In 
addition, the values $G_{1/2}(\alpha)$, $G_{3/2}(\alpha)$, . . . for a given deformation will 
give the excitations of the various levels of a nucleus with a fixed 
deformation. These energies, minimized, should correspond to the 
heads of vibration-rotation bands, special levels which are now in the 
process of being untangled [10]. The experimental statistics of the 
spacings of these band-head levels, when available, should therefore 
provide a direct check (even if only a statistical check) of the 
theoretical spacings. 


In the case of an odd nucleus undergoing spontaneous fission, the 
normal state spin will be identified with $\Omega$. For the normal state 
equilibrium deformation a state of this $\Omega$ value will lie lowest, and the 
specialization energy will vanish. This quantity, $G_\Omega(\alpha)$, then increases 
during the penetration of the fission barrier. The increase is not 
uniform, but has more the character of a random walk or diffusion 
process. Consequently it will be reasonable to estimate the order of 
magnitude of G as roughly proportional to the square root of the 
deformation. The increase of $\alpha$ in the traversal of the fission barrier 
will be of the order of unity, even when the nucleus starts with an 
equilibrium deformation of the magnitude $\alpha \sim 0.25$, $\delta \sim 0.38$ such 
as is deduced from E2 transition rates, quadrupole moments, and 
isotope shifts [11]. The range of deformations covered in Fig. 2 goes 
only from $\alpha = 0$ to $\alpha = 0.19$, or $\delta = 0$ to $\delta = 0.3$. In that range the 
increase of the typical G in the figure is of the order of ½ Mev. Consequently 
it is reasonable to estimate a total increase of ~ 1 Mev in 
going through the barrier, or an effective average increase of roughly 
(2/3) 1 Mev ~ 0.7 Mev. This alteration in barrier height is of the 
right order of magnitude to explain the observed dramatic slowing of 
spontaneous fission for odd nuclei. 
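
The square-root scaling used in this estimate can be checked with a short calculation. The sketch below assumes, as the argument above does, that G grows as the square root of the deformation, reaching about one half Mev at a deformation of 0.19 and continuing to a deformation of order unity; the function names and the numerical averaging are illustrative choices, not part of the original analysis.

```python
import math

def g_estimate(alpha, g_ref=0.5, alpha_ref=0.19):
    """Specialization energy (Mev), scaled as sqrt(alpha) from the value
    g_ref reached at alpha_ref (the range covered by Fig. 2)."""
    return g_ref * math.sqrt(alpha / alpha_ref)

def mean_over_barrier(alpha_max=1.0, steps=100_000):
    """Midpoint-rule average of g_estimate over 0..alpha_max."""
    h = alpha_max / steps
    total = sum(g_estimate((k + 0.5) * h) for k in range(steps))
    return total * h / alpha_max

total_rise = g_estimate(1.0)        # rise over the whole barrier
average_rise = mean_over_barrier()  # effective average rise

print(f"total rise   ~ {total_rise:.2f} Mev")
print(f"average rise ~ {average_rise:.2f} Mev")
print(f"ratio        ~ {average_rise / total_rise:.3f}")
```

The ratio of average to total rise comes out at 2/3, the mean of a square-root law over its range, and the absolute values are of the order of 1 Mev and 0.7 Mev.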


It would seem difficult to escape such a rise in the fission barriers for 
odd nuclei, but it evidently does not follow that the rise will be the same 
for all heavy odd nuclei. Even for nuclei of the same spin there will be 
a fluctuation of the average value of the specialization energy from case 
to case. The “random walk” evidenced by the $G_\Omega(\alpha)$ curves of Fig. 2 
shows of the order of 10 steps in a range $\Delta\alpha \sim 0.2$, implying perhaps 
~ 50 steps in the passage through the fission barrier, or ~ 25 steps to 
the halfway point. Consequently familiar statistical arguments would 
suggest variations in the height of the fission barrier for odd nuclei of 
the same spin, in a region far from magic numbers, of the order of 
0.7 Mev/$(25)^{1/2}$ ~ 0.15 Mev. The corresponding lifetimes may therefore 
be expected to depart from a smooth curve by a factor of the order 
of $10^{\pm 1}$, a fluctuation that is not incompatible with the observations. 
In the case of an odd nucleus of very high spin the energy of specialization 
will normally be considerably increased, because the spacings are 
large between levels of high $\Omega$ value (Fig. 3). An illustration of this 
effect appears in the curve for $G_{11/2}(\alpha)$ in Fig. 2. Here the specialization 
energy is roughly doubled, corresponding to a factor of increase in 
fission life, over even nuclei, not of $10^{3}$, but more like $10^{6}$. The spins 
obtained spectroscopically or inferred from the structure of rotational 
bands up to now do not include values so large as 11/2, so that no check 
on this prediction is yet possible. 
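
The statistical estimate of the fluctuation in barrier height in the preceding paragraph amounts to a few lines of arithmetic. The sketch below uses the step counts and energies quoted there, together with the earlier correspondence of roughly 0.35 Mev per factor of a thousand in lifetime; the variable names are illustrative.

```python
import math

steps_in_fig2 = 10      # random-walk steps seen over a deformation range of ~0.2
range_fig2 = 0.2
barrier_range = 1.0     # increase of deformation in traversing the barrier

steps_total = steps_in_fig2 * barrier_range / range_fig2   # ~50
steps_half = steps_total / 2.0                              # ~25

mean_rise = 0.7                                  # Mev, effective average rise
fluctuation = mean_rise / math.sqrt(steps_half)  # Mev

# Convert to decades of lifetime: 0.35 Mev corresponds to a factor 10^3
decades = 3.0 * fluctuation / 0.35

print(f"fluctuation in barrier height ~ {fluctuation:.2f} Mev")
print(f"lifetime scatter ~ 10^(+/-{decades:.1f})")
```

The result, about 0.14 Mev and a little over one decade in lifetime, agrees with the order-of-magnitude figures quoted.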


Stability of very heavy nuclei 


Spontaneous fission being the most impressive factor that limits the 
stability of very heavy nuclei, it is interesting to project this and other 
limits into the region of very large masses to estimate which nuclei 
will live long enough to be studied. The relevant effects besides fission 
will be alpha decay, beta decay and instability against spontaneous 
emission of neutrons. Proton emission is much less important than 
alpha decay among these heavy nuclei as a means to reduce an abnor- 
mally high ratio of protons to neutrons. 


A detailed treatment of the several decay mechanisms in the region 
of mass numbers A = 250 to A = 600 is impossible in default of know- 
ledge where closed shells occur. However, overlooking such details, 
one can expect an account of the main features of the stability from the 
semi-empirical analysis of nuclear binding energies, such as has already 
been used to correlate approximately nuclear binding energies below 
A = 250:* 


$$M(Z,A)\ (\text{in A.M.U.}) = 1.00812\,Z + 1.00893\,(A - Z) - A\{0.01504 - 0.083[(\tfrac{1}{2}A - Z)/A]^2\} + 0.014\,A^{2/3} + 3Z^2e^2/5r_0A^{1/3},$$ 


with $r_0 = 1.48 \times 10^{-13}$ cm; $3e^2/5r_0 = 0.627 \times 10^{-3}$ A.M.U. 
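
For orientation, the mass formula can be evaluated directly. The sketch below is a plain transcription of the formula as read here, with the electrostatic coefficient 3e²/5r₀ taken as 0.627 × 10⁻³ A.M.U. (an assumption, being the value that corresponds to r₀ = 1.48 × 10⁻¹³ cm); the function name is hypothetical and odd-even corrections are omitted, as in the text.

```python
def mass_amu(Z, A, coulomb=0.627e-3):
    """Semi-empirical atomic mass in A.M.U., without odd-even term.

    `coulomb` stands for 3e^2/(5 r0) expressed in A.M.U.; the default is
    an assumed value corresponding to r0 = 1.48e-13 cm.
    """
    asymmetry = ((0.5 * A - Z) / A) ** 2
    return (1.00812 * Z
            + 1.00893 * (A - Z)
            - A * (0.01504 - 0.083 * asymmetry)
            + 0.014 * A ** (2.0 / 3.0)
            + coulomb * Z ** 2 / A ** (1.0 / 3.0))

# Example: Z = 92, A = 238
print(f"M(92, 238) = {mass_amu(92, 238):.3f} A.M.U.")
```

For Z = 92, A = 238 the function returns about 238.12 A.M.U.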


Odd-even corrections have been omitted from this formula in accor- 
dance with the spirit of the present inquiry. Only even-even nuclei 
will be considered in dealing with spontaneous fission. In the same way 
we shall look apart from cases of high nuclear spin in beta and alpha 
decay where forbiddenness factors impart a more than average stability 
to a few exceptional nuclear species. Accordingly, it is appropriate to 


* See C. CORYELL, chapter on Beta-Decay Energetics in Annual Review of Nuclear 
Science, Vol. 2, 1953 (Annual Reviews, Inc., Stanford, Calif.) for a review of different 
formulations of the semi-empirical mass formula and comparison with experiment. The 
constants used in the formula in the text are those of FERMI, because of the very extensive 
tables of nuclear masses computed from these constants by N. METROPOLIS and G. 
REITWIESNER, U.S. Atomic Energy Commission Document NP-1980 (1950), although 
other choices of the constants lower the discrepancies in calculated mass values by a 
factor of about 2. 




ask for the location of those nuclei, $(Z,A)_{\text{crit}}$, for which the breakup 
probabilities, estimated on the assumption of allowed transitions, 
approach some critical rate. The choice of the corresponding critical 
times depends of course on the type of observation in mind. One year 
may be chosen rather arbitrarily as typical of long-term irradiation 
experiments, and $10^{-4}$ sec. as representative of certain types of fast 
timing work: 

$$\lambda_{\text{crit}} = 0.693/1\ \text{year} = 2.20 \times 10^{-8}/\text{sec} \quad \text{(long-term irradiation)},$$ 

$$\lambda_{\text{crit}} = 0.693 \times 10^{4}/\text{sec} \quad \text{(fast timing)}.$$ 
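
The two critical rates follow from the relation between decay constant and half life, λ = ln 2 / T. A minimal sketch, assuming a year of 3.156 × 10⁷ sec and reading the fast-timing interval as 10⁻⁴ sec (both assumptions of this example), is:

```python
import math

SECONDS_PER_YEAR = 3.156e7   # assumed length of the year in seconds

def critical_rate(half_life_sec):
    """Decay constant (per second) corresponding to a given half life."""
    return math.log(2.0) / half_life_sec

slow_rate = critical_rate(SECONDS_PER_YEAR)   # long-term irradiation
fast_rate = critical_rate(1.0e-4)             # fast timing work

print(f"one year : {slow_rate:.2e} per sec")
print(f"1e-4 sec : {fast_rate:.3e} per sec")
```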


Spontaneous fission of very heavy nuclei 


To estimate stability limits of even-even nuclei against spontaneous 
fission requires a more detailed analysis of the fission barrier. So far 
barriers have been compared only for odd and even nuclei of neigh- 
bouring charge and mass numbers; now a means has to be found to 
relate the fission activation energies for quite different nuclei. For this 
purpose the most primitive relevant idealization of nuclear properties 
is supplied by the liquid drop model. In that picture the deformation- 
dependent part of the nuclear energy, $V(\alpha)$, arises (1) from an incompleteness 
of saturation for particles at the nuclear surface, an effect 
proportional to the surface itself: $S(\alpha) = 4\pi R^2 f_s(\alpha)$; and (2) from a 
change in the electrostatic potential that also depends upon the geometry 
of the boundary; thus, 


$$V(\alpha) = 4\pi r_0^2 A^{2/3} O f_s(\alpha) + (3Z^2e^2/5r_0A^{1/3})\,\tfrac{1}{2} f_c(\alpha).$$ 


At the point of passage over the fission barrier, all the $\partial V/\partial\alpha_n$ are 
zero, a set of conditions that determine all the deformation coordinates, 
$\alpha_2, \alpha_3, \ldots$, in terms of the coefficients of $f_s(\alpha)$ and $f_c(\alpha)$ in $V(\alpha)$, or 
indeed, in terms of the ratio, x, of these coefficients: 


$$x = \frac{Z^2e^2}{10\,A\,O\,(4\pi r_0^3/3)} = \frac{Z^2/A}{(Z^2/A)_{\text{crit}}}.$$ 


Thus the potential rise, expressed in dimensionless form, 

$$\Delta V(\alpha,x)/4\pi r_0^2A^{2/3}O = [f_s(\alpha) - f_s(0)] + x[f_c(\alpha) - f_c(0)] = f(\alpha,x),$$ 

has at the saddle point a value that depends upon the parameter x alone: 

$$E_f/4\pi r_0^2A^{2/3}O = f(x).$$ 
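
The parameter x can be put into numbers under the idealized liquid drop assumptions. The sketch below reads x as Z²e²/[10 A O (4πr₀³/3)], which is the ratio of the coefficients of the two deformation functions, and assumes e² = 1.44 Mev × 10⁻¹³ cm (a standard value not given in the text), r₀ = 1.48 × 10⁻¹³ cm from the mass formula, and 4πr₀²O = 14 Mev as used later for the barrier constant. The function names are hypothetical.

```python
E2_MEV_FM = 1.44      # e^2 in Mev x 1e-13 cm (assumed standard value)
R0_FM = 1.48          # r0 in units of 1e-13 cm
SURFACE_MEV = 14.0    # 4 pi r0^2 O in Mev

def z2_over_a_crit():
    """(Z^2/A)_crit = 10 (4 pi r0^2 O) r0 / (3 e^2), from x = 1."""
    return 10.0 * SURFACE_MEV * R0_FM / (3.0 * E2_MEV_FM)

def fissility(Z, A):
    """Fissility parameter x = (Z^2/A) / (Z^2/A)_crit."""
    return (Z * Z / A) / z2_over_a_crit()

print(f"(Z^2/A)_crit = {z2_over_a_crit():.1f}")
print(f"x for Z = 92, A = 238: {fissility(92, 238):.2f}")
```

With these constants the critical value of Z²/A is close to 48. As the later discussion stresses, the effective constants for actual nuclei need not coincide with those of the mass formula, so the numbers are indicative only.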


The idealized liquid drop picture uniquely defines values for the 
functions $f_s(\alpha)$ and $f_c(\alpha)$, and consequently for the dimensionless 
measure of barrier height, f(x). Thus, the power series development for 
$f(\alpha,x)$ runs* [12]: 

$$\begin{aligned} f(\alpha,x) = {} & [(2/5)\alpha_2^2 + (116/105)\alpha_2^3 + \ldots \\ & + (5/7)\alpha_3^2 + \ldots \\ & + (9/9)\alpha_4^2 + \ldots + \text{terms in } \alpha_2^2\alpha_4 + \ldots] \\ & + x[-(10/25)\alpha_2^2 - (128/105)\alpha_2^3 + \ldots \\ & - (20/49)\alpha_3^2 \ldots \\ & - (30/81)\alpha_4^2 \ldots + \text{terms in } \alpha_2^2\alpha_4 + \ldots] \end{aligned}$$ 

When x is close to unity, the values of the several $\alpha$’s near the saddle 
point are very small, and dominating terms in the deformation function 
become 

$$f(\alpha,x) = \lambda(1 - x)\alpha^2 - \mu\alpha^3, \qquad (1)$$ 


* See also the detailed calculation of the critical form for four values of x by S. FRANKEL 
and N. METROPOLIS; Phys. Rev. 72, 914 (1947). One normally thinks of the surface 
configuration of the liquid-drop as defined by an expression of the form 

$$R(\theta) = R_0[1 + \alpha_0 + \sum_{n=2} \alpha_n P_n(\cos\theta)]$$ 

where $\alpha_0$ is determined in terms of the other deformation parameters by the condition of 
volume normalization. An alternative formulation is 

$$R(\theta) = R_0 a_0[1 + \sum_{n=2} \alpha_n P_n(\cos\theta)],$$ 

where $a_0$ is now the normalization constant. Still a third description of the surface 
expresses the cylindrical coordinates of the typical point, $z$, $\rho$, in terms of prolate spheroidal 
coordinates, 

$$z = a_0\,\xi_{\text{surf}}\,\eta, \qquad \rho = a_0[(\xi_{\text{surf}}^2 - 1)(1 - \eta^2)]^{1/2},$$ 

where $a_0$ is again fixed by the normalization requirement, and 

$$\xi_{\text{surf}} = \xi_0[1 + \sum_{n}^{\infty} \alpha_n P_n(\eta)],$$ 

and $\alpha_2 = 1/3\xi_0^2$. The expansion of the deformation energy agrees to terms of the second 
order for all three definitions, but higher order terms naturally depend upon the choice 
of definition. Of course all physically significant results are invariant with respect to this 
choice of coordinate system. The power series in the text is based upon the third definition. 
It is sometimes convenient to introduce for pure spheroidal deformations a canonical 
elongation coordinate, $\alpha$, such that 

$$\exp\alpha = \text{semimajor axis}/R_0 = (1 - \xi_0^{-2})^{-1/3} = (1 - 3\alpha_2)^{-1/3}.$$ 

In terms of this coordinate 

$$f_s = 1 + (2/5)\alpha^2 - (10/105)\alpha^3 + \ldots$$ 
$$f_c = 2 - (2/5)\alpha^2 - (2/105)\alpha^3 + \ldots$$ 

See also a paper by U. L. BUSINARO and S. GALLONE, on the asymmetry of nuclear fission 
(seen in preprint form through the kindness of the authors) for higher order terms in the 
expansion of the liquid drop deformation energy (Nuovo Cim., to be published). 




where $\alpha$ is shorthand for $\alpha_2$; $\lambda$, for 2/5; and $\mu$, for 12/105. A simple
calculation shows that $f(\alpha,x)$ goes through a maximum of height

$$E_f/4\pi r_0^2 A^{2/3} O = f(x) = (4\lambda^3/27\mu^2)(1 - x)^3.$$


Consequently one expects for the height of the fission barrier near
x = 1 an expression of the form

$$E_f = gA^{2/3}[1 - (Z^2/A)/(Z^2/A)_{crit}]^3 \qquad (2)$$

where on the liquid drop model

$$g = 4\pi r_0^2 O[4(2/5)^3/27(12/105)^2] = 14\ \mathrm{Mev}\ (98/135) = 10\ \mathrm{Mev}. \qquad (3)$$


Further from x = 1 a more complicated dependence upon x obtains [13].
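The step from the two-term expansion (1) to the barrier height and to the factor 98/135 in (3) can be checked with exact rational arithmetic. The following is only a sketch under the assumptions stated in the text ($\lambda$ = 2/5, $\mu$ = 12/105); the function names are illustrative and do not come from the original.

```python
from fractions import Fraction

LAMBDA = Fraction(2, 5)     # quadratic coefficient quoted in the text
MU = Fraction(12, 105)      # cubic coefficient quoted in the text


def deformation_function(alpha, x):
    """Two-term form f = lambda*(1 - x)*alpha^2 - mu*alpha^3 of Eq. (1)."""
    return LAMBDA * (1 - x) * alpha**2 - MU * alpha**3


def saddle_alpha(x):
    """Deformation where df/dalpha = 0 (other than alpha = 0)."""
    return 2 * LAMBDA * (1 - x) / (3 * MU)


def barrier(x):
    """Height of f at the saddle point."""
    return deformation_function(saddle_alpha(x), x)


def closed_form(x):
    """(4 lambda^3 / 27 mu^2) (1 - x)^3, as stated in the text."""
    return 4 * LAMBDA**3 / (27 * MU**2) * (1 - x) ** 3


# Numerical factor multiplying 4*pi*r0^2*O in Eq. (3).
G_RATIO = closed_form(Fraction(0))

if __name__ == "__main__":
    x = Fraction(9, 10)
    print(barrier(x) == closed_form(x))
    print(G_RATIO, float(14 * G_RATIO))
```

With 14 Mev for the surface energy constant the product comes out slightly above 10 Mev, which the text rounds to 10 Mev.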

In actual nuclei we cannot assume that the effective surface tension
for the discussion of deformations is necessarily identical with the
corresponding quantity in the semi-empirical mass formula. The same
caveat goes for the quantity $e^2/r_0$ in the formula for the deformation
energy. Also the energy takes a minimum value, not at $\alpha_2 = 0$, but at
some other deformation $(\alpha_2)_{min}$. It therefore suggests itself to define
$\alpha = \alpha_2 - (\alpha_2)_{min}$ and to define x in such a way that this fissility
parameter is unity for those nuclei which no longer possess a fission
barrier; but otherwise to hold to the simple two term formula for
$f(\alpha,x)$ in terms of two empirical constants, $\lambda$ and $\mu$. A formula of this
type will be expected from almost any simple argument about a power
series expansion near x = 1, $\alpha$ = 0. The real point at issue is deeper.
Can the fissility parameter x be supposed still to have the simple form

$$x = (Z^2/A)/(Z^2/A)_{crit} \qquad (4)$$


as predicted by the liquid drop model? The following discussion 
suggests that this assumption is not an unreasonable working 
hypothesis. 

SEABORG has attempted with some success to correlate the slow
neutron fissility of 28 nuclei from A = 226 to A = 243 by way of the
equation [14]

$$E_f = (19.0 - 0.36 Z^2/A)\ \mathrm{Mev}, \qquad (5)$$

with barrier heights ranging from 5.4 Mev to 6.8 Mev. A little computation
shows that SEABORG's numbers are represented equally well
by the formula

$$E_f = 0.495\ \mathrm{Mev}\ A^{2/3}[1 - (Z^2/52.8A)] \qquad (6)$$

and by the formula

$$E_f = 0.805\ \mathrm{Mev}\ A^{2/3}[1 - (Z^2/86A)]^3. \qquad (7)$$




The range of variation of $Z^2/A$ is too small to distinguish from each
other a first power, second power, or third power law. This situation
is not easy to remedy. $Z^2/A$ does not have to change much percentagewise
to carry fission half lives from values too long to observe to values
too short for reasonable stability. For very heavy nuclei the interesting
range of $Z^2/A$ will evidently be even smaller. Consequently the validity
of (6) or (7) is limited to values of $Z^2/A$ in the range $(88)^2/228 = 34.0$
to $(100)^2/254 = 39.4$. In particular it is not justified to try to identify
either expression with the limiting formula (2) that is valid only very
close to $(Z^2/A)_{crit}$. Moreover, this limiting value of $Z^2/A$ will be equal
neither to 52.8 nor to 86, just as the constant 0.805 Mev is very far from
the value (3) of 10 Mev.
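How little the three representations (5), (6) and (7) differ over the stated range of $Z^2/A$ can be seen by evaluating them side by side. The sketch below uses only the constants printed in the text; the function names and the three sample nuclei are chosen here for illustration.

```python
def barrier_eq5(Z, A):
    """Linear correlation, Eq. (5), in Mev."""
    return 19.0 - 0.36 * Z**2 / A


def barrier_eq6(Z, A):
    """First power law, Eq. (6), in Mev."""
    return 0.495 * A ** (2 / 3) * (1 - Z**2 / (52.8 * A))


def barrier_eq7(Z, A):
    """Third power law, Eq. (7), in Mev."""
    return 0.805 * A ** (2 / 3) * (1 - Z**2 / (86.0 * A)) ** 3


def spread(Z, A):
    """Largest disagreement (Mev) among the three formulas for one nucleus."""
    values = (barrier_eq5(Z, A), barrier_eq6(Z, A), barrier_eq7(Z, A))
    return max(values) - min(values)


if __name__ == "__main__":
    # End points of the quoted Z^2/A range and one nucleus in between.
    for Z, A in [(88, 228), (92, 238), (100, 254)]:
        print(Z, A, round(Z**2 / A, 2),
              round(barrier_eq5(Z, A), 2),
              round(barrier_eq6(Z, A), 2),
              round(barrier_eq7(Z, A), 2))
```

Over this range the three formulas stay within a few tenths of a Mev of one another, which is the sense in which the data cannot distinguish the power laws.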


SEABORG’s exploratory analysis proved so promising that it would 
be very interesting to see the treatment extended to include (1) appro- 
priate correction for the extra barrier height of odd-A nuclei, (2) the 
available photofission threshold measurements, (3) the new improved 
Berkeley estimates of neutron binding energies, (4) observed spontaneous 
fission rates and (5) additional data from Berkeley and elsewhere since 
1952 on slow neutron fissility. Of course, even with these extensions 
one cannot expect to get complete correlation with the predictions 
of the liquid drop model. This picture would of course predict a 
sphere for the normal configuration of the nucleus, in contrast to the 
abundant evidence for large equilibrium deformations in the actinide 
series. Moreover, spontaneous fission rates for even-even nuclei are 
found to depend strongly not only upon $Z^2/A$, as expected, but also
upon A, as not anticipated from the small change of $A^{2/3}$ over the
relevant region of mass values [15]. HUIZENGA suggests that this 
effect may arise from a dependence of the nuclear deformation upon A. 
Thus, in the process of shell filling each added nucleon produces an 
unsymmetrically distributed force of its own on the nuclear surface. 
The sum of these unbalanced forces over all nucleons above a closed 
shell may be regarded as responsible for the non-zero equilibrium 
deformation [16]. The quadrupole or $\alpha_2$ part of the nuclear deformation,
which determines the nuclear moment of inertia, increases in a
rather regular way for nuclei beyond the closed shell at $_{82}\mathrm{Pb}^{208}$.
However, this component of the nuclear distortion levels off to a 
practically constant value for nuclei in the range A = 240 to A = 256, 
as indicated by the position of the first rotational level, J = 2, of 
nuclei in this region [17]. This circumstance is no real objection to 
HUIZENGA’s interpretation of the specific A dependence of nuclear 
fissility. Other significant parameters of the nuclear deformation, 
$\alpha_3$ and $\alpha_4$, are not measured by the spacing of the even rotational levels.
However, for several even nuclei in the range of masses A = 226 to
A = 232 an odd level with J = 1 lies low enough to have been
observed [18]. For this level CHRISTY [19] has advanced a tentative
explanation that assumes a nuclear asymmetry, such as would be
specified by the parameter $\alpha_3$. A rotation of the nucleus by $\pi$, followed
by a reversal of the sign of $\alpha_3$, brings the nucleus back to its original
configuration [20]. That state of the $\alpha_3$ vibration mode which is antisymmetrical
in $\alpha_3$ lies above the lowest oscillational degree of freedom
by an amount which, on CHRISTY's picture, fixes the height of the
rotational level sequence $1^-, 3^-, \ldots$ with respect to $0^+, 2^+, \ldots$. If
this reasonable interpretation of the $1^-$ level is accepted, then it follows
from the change in height of this level with A that the $\alpha_3$ part of the
nuclear deformation potential is also changing with A in a way quite
different from what one would expect from the primitive form of the
liquid drop model, and a way that has no simple correlation with the
observed changes in the equilibrium value of $\alpha_2$. Similarly one can
anticipate changes in the equilibrium value of $\alpha_4$ which will alter the
nuclear fissility in an A dependent way that has no simple connection
with the predictions of the liquid drop model. We cannot expect
to include such details in a simple formula for fission barrier as a func- 
tion of Z and A. 


It is remarkable how small seem to be the particularities in the 
fissility of even-even nuclei that follow from special circumstances 
in the order of filling of nucleonic orbits and from the deformations 
produced by orbital pressures. This uniformity follows from SEABORG's
correlation of the slow neutron fissility of 28 nuclei with $Z^2/A$. It
also comes into evidence in plots by SEABORG, WHITEHOUSE and
GALBRAITH and HUIZENGA of spontaneous fission half life as a function
of $Z^2/A$ for even-even nuclei over a range of life times that now reaches
from 200 days to $1.4 \times 10^{18}$ years [21]. These circumstances make it
appropriate to discuss the spontaneous fission rates of even very heavy 
nuclei in terms of the liquid drop formalism. Of course the possibility 
cannot be excluded that one will make a bad error in this way. The 
rather small change of $\alpha_2$ in the region where fission has been studied
so far by no means rules out large changes in the equilibrium value of 
this deformation parameter for very large A. Little is known about 
how such an alteration would affect nuclear fissility. However, even 
with such changes, $\alpha_2$ will be expected to have an average value
which is positive [22] for heavy nuclei as it is in the actinide series. 
For this reason one will expect to get at worst something like an 
average picture of the fissility properties of heavy nuclei by adopting 
the scaling principle of the liquid drop model, both for the barrier 
height, $E_f$, already discussed and for the spontaneous fission decay
constant, $\lambda_f$. This we shall now do.


In the formula for the decay constant,

$$\lambda_f = 10^{22}\ \mathrm{sec.}^{-1} \exp(-2I), \qquad (8)$$


one ought in principle to allow for a slight dependence of the dimensional
coefficient upon Z and A [14], but this dependence is slight and
may be neglected in comparison to the great effect of changes in charge
and mass number in the penetration exponent

$$2I = 2^{3/2}\hbar^{-1} \int [\Delta V(\alpha) - E]^{1/2} \{\sum_i M_i (dr_i/d\alpha)^2\}^{1/2} d\alpha, \qquad (9)$$


where the sum goes over all nucleons. We wish to correlate this
exponent with the barrier height, $E_f$. To simplify the correlation, let the
zero point vibration energy, E, be neglected in comparison with
the barrier height, so that the first factor in the integral can be written

$$[4\pi r_0^2 O A^{2/3} f(x,\alpha)]^{1/2}.$$

The second factor has the form

$$[(3/10) A^{5/3} M r_0^2 g(\alpha)]^{1/2},$$

where $g(\alpha)$ reduces to 1 for small $\alpha$, according to the liquid drop model.
Thus the general scaling law for the penetration exponent is

$$2I = (12/5)^{1/2} [(4\pi r_0^2 O)/(\hbar^2/M r_0^2)]^{1/2} A^{7/6} \int f^{1/2}(x,\alpha)\, g^{1/2}(\alpha)\, d\alpha. \qquad (10)$$


Let one specialize to the case where the fissility parameter, x, is close
to 1, and use the corresponding expansion, (1), for the dimensionless
measure of deformation potential, $f(x,\alpha)$. Then the integral in (10)
becomes

$$\int [\lambda(1 - x)\alpha^2 - \mu\alpha^3]^{1/2} d\alpha = (4\lambda^{5/2}/15\mu^2)(1 - x)^{5/2}.$$
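The closed form of this integral can be checked numerically. The sketch below assumes that the integration runs from $\alpha$ = 0 to the outer turning point $\alpha = \lambda(1 - x)/\mu$, where the integrand vanishes, and takes $g(\alpha)$ = 1; the midpoint rule and the function names are choices made here, not part of the original.

```python
import math

LAMBDA = 2 / 5      # quadratic coefficient quoted in the text
MU = 12 / 105       # cubic coefficient quoted in the text


def penetration_integral(x, steps=20000):
    """Midpoint-rule value of the integral of sqrt(f) between the turning points."""
    c = LAMBDA * (1 - x)
    alpha_max = c / MU              # f vanishes again at this deformation
    h = alpha_max / steps
    total = 0.0
    for k in range(steps):
        alpha = (k + 0.5) * h
        f = c * alpha**2 - MU * alpha**3
        total += math.sqrt(max(f, 0.0))
    return total * h


def closed_form(x):
    """(4 lambda^(5/2) / 15 mu^2) (1 - x)^(5/2), as stated in the text."""
    return 4 * LAMBDA**2.5 / (15 * MU**2) * (1 - x) ** 2.5


if __name__ == "__main__":
    for x in (0.7, 0.9, 0.99):
        print(x, penetration_integral(x), closed_form(x))
```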


In the same special case the barrier height goes as $(1 - x)^3$, according
to Eq. (2). Thus we can relate the penetration exponent directly to the
barrier height, with the result

$$2I = 9 \cdot 5^{-3/2}\, 2^{4/3} \mu^{-1/3} (4\pi r_0^2 O)^{-1/3} (\hbar^2/M r_0^2)^{-1/2} A^{11/18} E_f^{5/6} \qquad (11)$$
$$\quad = (A/\mathrm{const})^{7/6} [1 - (Z^2/A)/(Z^2/A)_{crit}]^p, \qquad (12)$$

where the power p is 5/2. This formula being derived in the limit in
which $E_f$ is proportional to $(1 - x)^3$, and Eq. (7) describing just such a
dependence of $E_f$ upon $(1 - x)^3$, with $(Z^2/A)_{crit} = 86$, one might be
tempted to substitute this value for $(Z^2/A)_{crit}$ into Eq. (12) and choose
the one remaining constant to fit the known half lives for spontaneous
fission. But then the calculated slope of $2I$ as a function of $Z^2/A$ comes
out to be only about half the observed slope. In other words, Eq. (11)
and the $x \to 1$ limit analysis behind it give much too slow a dependence
of barrier width upon barrier height to fit the observations, a dependence
as slow as $E_f^{5/6}$. This outcome is not surprising, for actual nuclei
are rather far from the ideal limit x = 1.


In default of a theoretical determination of the constants in Eq. (12),
it is appropriate to proceed empirically. Table I lists the fission lives
for sixteen even-even nuclei [23]. For each case the experimental
penetration exponent, $2I$, has been calculated and divided by $A^{7/6}$ to
obtain the "reduced penetration exponent," plotted in Fig. 4 as a
function of $Z^2/A$.

[Figure: reduced penetration exponent (vertical axis, 0.10 to 0.20) plotted against $Z^2/A$.]

Fig. 4. Reduced penetration exponent, $2I/A^{7/6}$, for the 16 even-even nuclei
for which the spontaneous fission transformation constant is known
(Table I). The dotted curve has the slope that corresponds to Eq. (12) with
$(Z^2/A)_{crit} = 86$ and with the power p = 2.5. The two smooth curves
correspond to Eqs. (13) and (14), and are extrapolated in the text to estimate
the fission rates of very heavy nuclei.

One cannot draw a single smooth curve through the points, as
demanded by the scaling law of the liquid drop model. However, that
scaling law provides the only evidence for estimating the fissility of
very heavy nuclei. Consequently one is led to represent the data by a
best fitting average curve of the form (12), with three constants:
const, $(Z^2/A)_{crit}$, and the power p. Two extreme cases have been
considered: (1) $(Z^2/A)_{crit} = 86$, as in Eq. (7); then p has to be increased
from 2.5 to 6.363 to fit the data; (2) p = 1; then $(Z^2/A)_{crit}$ has to be
taken as small as 44.81 to fit the data. The two empirical formulae,

$$2I = (A/0.222)^{7/6} [1 - (Z^2/86A)]^{6.363} \qquad (13)$$

and

$$2I = (A/1.085)^{7/6} [1 - (Z^2/44.81A)] \qquad (14)$$

provide equally good representations of the average trend of spontaneous
fission half lives in the experimental region.


Table I. Reduced penetration exponent, RPE, of even-even
nuclei, calculated from observed rates of spontaneous fission
and the definition

$$0.693/T_{1/2}(\mathrm{s.f.})\ (\mathrm{sec.}^{-1}) = 10^{22}\ \mathrm{sec.}^{-1} \exp(-2I) = 10^{22}\ \mathrm{sec.}^{-1} \exp(-A^{7/6}\,\mathrm{RPE})$$

Nucleus       T_1/2(s.f.) (yr)    RPE = 2I/A^(7/6)    Z^2/A
90-Th-232     1.4 x 10^18         0.1913              34.91
92-U-234      1.6 x 10^16         0.1818              36.17
92-U-236      2 x 10^16           0.1804              35.87
92-U-238      8.0 x 10^15         0.1770              35.57
94-Pu-236     3.5 x 10^9          0.1538              37.43
94-Pu-238     4.9 x 10^10         0.1568              37.12
94-Pu-240     1.2 x 10^11         0.1568              36.82
94-Pu-242     6.7 x 10^10         0.1543              36.51
96-Cm-240     1.9 x 10^6          0.1383              38.41
96-Cm-242     7.2 x 10^6          0.1392              38.09
96-Cm-244     1.4 x 10^7          0.1389              37.77
98-Cf-246     2.1 x 10^3          0.1233              39.04
98-Cf-248     7 x 10^3            0.1240              38.72
98-Cf-250     1.5 x 10^4          0.1242              38.42
98-Cf-252     66                  0.1144              38.11
100-254       0.60                0.1060              39.37
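The entries of Table I can be reproduced from the half lives alone. The sketch below is illustrative: it assumes a frequency factor of $10^{22}$ sec.$^{-1}$ in the decay law and $3.156 \times 10^7$ seconds per year, both of which are assumptions made here because they reproduce the tabulated values; the function names are hypothetical. The two fitted curves (13) and (14) are included in reduced form for comparison.

```python
import math

SECONDS_PER_YEAR = 3.156e7   # assumed conversion factor
FREQUENCY_FACTOR = 1e22      # sec^-1, assumed: this value reproduces Table I


def reduced_penetration_exponent(half_life_years, A):
    """RPE = 2I / A^(7/6) from a spontaneous fission half life."""
    decay_constant = 0.693 / (half_life_years * SECONDS_PER_YEAR)
    two_I = math.log(FREQUENCY_FACTOR / decay_constant)
    return two_I / A ** (7 / 6)


def rpe_eq13(Z, A):
    """Eq. (13) divided by A^(7/6)."""
    return (1 / 0.222) ** (7 / 6) * (1 - Z**2 / (86.0 * A)) ** 6.363


def rpe_eq14(Z, A):
    """Eq. (14) divided by A^(7/6)."""
    return (1 / 1.085) ** (7 / 6) * (1 - Z**2 / (44.81 * A))


if __name__ == "__main__":
    # (Z, A, half life in years) for three rows of Table I
    for Z, A, t in [(90, 232, 1.4e18), (98, 252, 66.0), (100, 254, 0.60)]:
        print(Z, A,
              round(reduced_penetration_exponent(t, A), 4),
              round(rpe_eq13(Z, A), 4),
              round(rpe_eq14(Z, A), 4))
```

The fitted curves describe only the average trend, so individual nuclei can lie noticeably off them.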


With the fission rates normalized in this way one is in a position to
estimate the lives of very heavy nuclei, subject to all the provisos and
uncertainties that have been mentioned. For a half life with respect to
spontaneous fission of $10^{-4}$ sec., or a penetration exponent of about
41.8, and for a mass number of A = 600, one requires according to
Eq. (13) a charge of Z = 173 (with $E_f$ = 4.3 Mev), and according to
(14) a Z value of 162. The difference between these numbers, $\Delta Z = 11$,
provides a measure, but only a partial measure, of the uncertainty in
the limit for fission stability. For smaller A values the magnitude of
$\Delta Z$ of course diminishes, approaching zero in the region of the known
nuclei to which the fission formulae have been adjusted. For definiteness,
all the curves for fission limits in Fig. 5 have been based upon the
first power law of Eq. (14), as it predicts a smaller extension for the
region of very heavy nuclei.

[Figure: estimated limits of nuclear stability, with limiting curves labelled for $\alpha$-decay and fission; horizontal axis from 0 to 700.]

Fig. 5. Estimates for the limits of nuclear stability obtained as described
in the text. Decay rates for all processes were calculated assuming the
minimum hindrance: spontaneous fission rates were estimated as if all
nuclei had even-even character, whereas fission will be hindered by a factor
of the order of $10^{4 \pm 1}$ for nuclei of odd A; neutron escape was estimated as
if no centrifugal barrier intervened; and beta decay rates were calculated
on the assumption of allowed transitions. Consequently it will be expected
that there will exist special nuclei of exceptional stability standing well
outside the indicated limits of stability. In so far as nuclei with lives of
$10^{-4}$ sec. or more are subject to experimentation, the curves in the figure
attribute an experimentally testable reality to nuclei with masses two or
more times heavier than known nuclei.
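The charge limits quoted for A = 600 follow from inverting Eqs. (13) and (14) for Z at a fixed penetration exponent. The sketch below takes the value 41.8 from the text as the target; the function names are illustrative only.

```python
def z_limit_eq13(A, two_I=41.8):
    """Charge Z at which Eq. (13) gives the penetration exponent two_I."""
    bracket = (two_I / (A / 0.222) ** (7 / 6)) ** (1 / 6.363)
    return (86.0 * A * (1 - bracket)) ** 0.5


def z_limit_eq14(A, two_I=41.8):
    """Charge Z at which Eq. (14) gives the penetration exponent two_I."""
    bracket = two_I / (A / 1.085) ** (7 / 6)
    return (44.81 * A * (1 - bracket)) ** 0.5


if __name__ == "__main__":
    z13 = z_limit_eq13(600)
    z14 = z_limit_eq14(600)
    print(round(z13), round(z14), round(z13) - round(z14))
```

The difference between the two rounded values is the $\Delta Z$ that the text uses as a partial measure of the uncertainty.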


Alpha decay 


In the case of alpha decay it will be sufficiently accurate for our purpose
to represent the disintegration constant in the form*

$$\lambda_\alpha \approx 2 \times 10^{[\text{exponent illegible}]}\ \mathrm{sec.}^{-1} \exp(-2I),$$


* See J. RASMUSSEN, S. THOMPSON and A. GHIORSO, Phys. Rev. 89, 33 (1953), as well
as the review article, "Alpha Radioactivity," by I. PERLMAN and F. ASARO, UCRL-2613,
May, 1954, for a summary of various treatments of barrier penetration. The effective
nuclear radius has been taken to be $r_0 A^{1/3}$, with $r_0 = 1.52 \times 10^{-13}$ cm $= 0.540\,e^2/mc^2 = e^2/0.95$ Mev.
This standard Berkeley value, being empirical, evidently makes an implicit
allowance for the average effect of nuclear deformations in lowering the potential barrier
against alpha emission.


where the penetration exponent has the value

$$2I = (8Z/137)(c/v)[\mathrm{arc\,cos}\, x^{1/2} - x^{1/2}(1 - x)^{1/2}],$$

with $v^2/c^2 = E_{dis}/(\tfrac{1}{2} M_{\alpha,\,reduced}\, c^2) = E/(1 - 4/A)\,1862$ Mev
and $x = E_{dis}/\text{barrier height} = E_{dis} A^{1/3}/Z\,1.90$ Mev.
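For orientation, the penetration exponent can be evaluated directly from these three relations. The sketch below is a reading of the formulas as reconstructed above, with Z taken as the charge number that appears in them; the function name is hypothetical and no claim is made about the frequency factor in front of the exponential.

```python
import math


def alpha_penetration_exponent(Z, A, E_dis):
    """2I for alpha decay; E_dis is the disintegration energy in Mev."""
    # Ratio of disintegration energy to barrier height (1.90 Z / A^(1/3) Mev).
    x = E_dis * A ** (1 / 3) / (1.90 * Z)
    # Relative velocity from the reduced mass of the alpha particle.
    v_over_c = math.sqrt(E_dis / ((1 - 4 / A) * 1862.0))
    bracket = math.acos(math.sqrt(x)) - math.sqrt(x * (1 - x))
    return (8 * Z / 137) / v_over_c * bracket


if __name__ == "__main__":
    # Illustrative inputs: A = 250, Z = 100 at two disintegration energies.
    print(alpha_penetration_exponent(100, 250, 6.61))
    print(alpha_penetration_exponent(100, 250, 10.04))
```

The exponent falls steeply as the disintegration energy rises, which is why a few Mev separate a half life of a year from one of $10^{-4}$ sec.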


To make the calculated alpha decay constants take on the assumed
critical values, the disintegration energies should have the values listed
in Table II. On the other hand, the semi-empirical mass formula
allows one to estimate the energy release as an explicit function of Z
and A:

$$E = (931\ \mathrm{Mev/A.M.U.})[2(\partial M/\partial Z)_A + 4(\partial M/\partial A)_Z - M_\alpha].$$

For each of the A values in Table II there has been selected that charge
value, $Z_1$, which gives a decay energy, $E_1$, such that the combination
$(Z_1, E_1)$ leads to a half life of 1 yr; similarly for $(Z_2, E_2)$ and $10^{-4}$ sec.
This procedure leads to the curves for alpha stability shown in Fig. 5.


Table II. Alpha disintegration energies, required to make the half life
against alpha decay equal to $T_1$ = 1 year or $T_2 = 10^{-4}$ sec., as calculated
for representative charge and mass numbers in the region of
very heavy nuclei. All energies are expressed in Mev.

A                     250               400               600
Z                  90     100       120     140       160     180
E(1 year)         5.70    6.61      7.93    9.83     10.87   12.73
(Z_1, E_1)         (100, 6.6)        (142, 10)         (194, 14)
E(10^-4 sec.)     8.74   10.04     11.6    14.2      15.2    17.6
(Z_2, E_2)         [illegible]       (158, 17)         (214, 22)


Neutron emission 

In the case of neutron emission it will be unreasonable to assume any 
significant centrifugal potential barrier in the typical case. Conse- 
quently the neutron stability limit is expressed in the approximation 
of the semi-empirical mass formula by the relation 


$$(\partial M(Z,A)/\partial A)_{Z,\,A-1/2} = M_n = 1.00893.$$

The corresponding curve in Fig. 5 evidently leads into a domain of 


high neutron to proton ratio, a region where the semi-empirical formula 
could be appreciably in error. 


Beta decay 


It is appropriate to consider only normal allowed transitions in analyz- 
ing the limits for beta stability. Superallowed or favoured transitions 
will not be expected for these heavy nuclei; and forbidden transitions 
will only extend the calculated limits of stability. The data analysis of
FEENBERG and TRIGG [24] shows that the normal allowed transitions
are characterized by values of $\log_{10} ft$ that lie between 4 and 5.8, where
t is the half life in seconds and f is a dimensionless function of nuclear
charge and of the energy release in the beta decay process. A value of
$\log_{10} ft = 4$ will therefore give a conservative representation of the
beta decay limit. The two characteristic half lives, $10^{-4}$ sec. and 1 year,
correspond to $\log_{10} t = -4.0$ and 7.5, respectively; or $\log_{10} f = 8.0$ and
$-3.5$. The function f of beta energy and nuclear charge may be written
in the form $f_0 a$, where $f_0$ is a function of the energy release alone,
$\Delta E = mc^2(w_0 - 1)$,

$$f_0(w_0) = \int_1^{w_0} (w_0 - w)^2 (w^2 - 1)^{1/2}\, w\, dw,$$

and the attraction factor, a, now measures the influence of the nuclear
attraction in increasing the electronic wave function near the nucleus,
averaged over the electron spectrum. The value of this factor may be
read from Figs. 1-3 of the paper of FEENBERG and TRIGG for nuclear
charges up to Z = 100, and extrapolated to Z = 170.


Consider first the short life limit, $10^{-4}$ sec. It implies a high energy
release in beta decay. But the greatest release possible will occur for a
nucleus in which the neutron excess is so great that spontaneous
neutron emission is about to occur. Such nuclei have beta decay
energies between 14 Mev (Z = 160, A = 612) and 17 Mev (Z = 100,
A = 366), according to the semi-empirical mass formula. For a beta
energy of $w_0 mc^2 = 30mc^2$ the Fermi decay function, $f_0$, has a value
given by $\log_{10} f_0 = 5.9$. For the same energy the electron concentration
factor, a, extrapolates to a value between $\log_{10} a = 1$ and
$\log_{10} a = 2$, depending upon the value of Z. Evidently one is just
short of being able to attain a value of $\log_{10} f = \log_{10} f_0 a$ equal to 8,
as would correspond to a life of $10^{-4}$ sec. (lower cross hatched region
in Fig. 5). Hence so short a beta half life is not expected for the
superheavy nuclei that are neutron stable.
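The figures in this argument are easy to check numerically. The sketch below evaluates the Fermi integral $f_0(w_0)$ by a simple midpoint rule and repeats the $\log_{10} ft = 4$ arithmetic; the function names, the step count and the year length of $3.156 \times 10^7$ sec. are choices made here for illustration.

```python
import math


def fermi_integral(w0, steps=100000):
    """Midpoint-rule value of f0(w0) = int_1^w0 (w0-w)^2 sqrt(w^2-1) w dw."""
    h = (w0 - 1.0) / steps
    total = 0.0
    for k in range(steps):
        w = 1.0 + (k + 0.5) * h
        total += (w0 - w) ** 2 * math.sqrt(w * w - 1.0) * w
    return total * h


def required_log10_f(half_life_seconds, log10_ft=4.0):
    """log10 f needed to reach a given half life at fixed log10(ft)."""
    return log10_ft - math.log10(half_life_seconds)


if __name__ == "__main__":
    log_f0 = math.log10(fermi_integral(30.0))
    print(round(log_f0, 2))                         # Fermi function alone
    print(round(log_f0 + 1, 2), round(log_f0 + 2, 2))  # with log10 a = 1 .. 2
    print(required_log10_f(1e-4))                   # short life limit
    print(round(required_log10_f(3.156e7), 1))      # one year
```

With $\log_{10} a$ between 1 and 2 the product $f_0 a$ stays just below the value needed for a life of $10^{-4}$ sec., in line with the conclusion of the paragraph.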

A beta half life of one year, or more precisely, a $\log_{10} f$ value of
$-3.5$, will correspond to an energy release in beta decay of about
$0.02\,mc^2$ for Z = 100, and to a still smaller energy release for higher
Z values, according to the curves of FEENBERG and TRIGG. The nuclei
that give beta energies in this general range are indicated by the upper
cross hatched region in Fig. 5.




Limits of nuclear stability 


Inspection of Fig. 5 shows that fission and neutron emission dominate 
the stability analysis of very heavy nuclei. In so far as one can extra- 
polate the stability criteria for known nuclei, one is led to expect the 
existence of nuclei twice as heavy as known nuclear species, that can be 
created by massive neutron irradiation, and that will live long enough 
to be studied in the laboratory. 


REFERENCES 


[1] N. BOHR and J. A. WHEELER; Phys. Rev. 56, 426, 1939
[2] N. BOHR; Phys. Rev. 55, 418, 1939
[3] D. L. HILL and J. A. WHEELER; Phys. Rev. 89, 1102, 1953
A. BOHR; Rotational States of Atomic Nuclei, Thesis, Copenhagen, 1954
[4] G. T. SEABORG; Phys. Rev. 85, 157, 1952
W. J. WHITEHOUSE and W. GALBRAITH; Nature 169, 494, 1952
[5] J. R. HUIZENGA; Phys. Rev. 94, 158, 1954
[6] A. BOHR; Dan. Mat.-fys. Medd. 26, No. 14, 1952
[7] A. BOHR and B. R. MOTTELSON; Dan. Mat.-fys. Medd. 27, No. 16, 1953
[8] O. KLEIN; Z. Phys. 76, 226, 1932
R. RYDBERG; Z. Phys. 73, 376, 1932 and 80, 514, 1933
[9] S. G. NILSSON; to be published
[10] See for example F. ASARO and I. PERLMAN; Phys. Rev. 93, 1423, 1954
[11] K. W. FORD; Phys. Rev. 90, 29, 1953
[12] N. BOHR and J. A. WHEELER; Phys. Rev. 56, 426, 1939
[13] See FRANKEL and METROPOLIS, reference in footnote on p. 174; also D. L. HILL
and J. A. WHEELER; Phys. Rev. 89, 1121 (Fig. 3), 1953
[14] G. T. SEABORG; Phys. Rev. 88, 1429, 1952
[15] J. R. HUIZENGA; Phys. Rev. 94, 158, 1954
See also J. R. HUIZENGA and R. B. DUFFIELD; Phys. Rev. 88, 959, 1952
[16] J. RAINWATER; Phys. Rev. 79, 432, 1950
D. L. HILL and J. A. WHEELER; Phys. Rev. 89, 1102, 1953
[17] F. ASARO and I. PERLMAN; Phys. Rev. 87, 393, 1952
[18] F. ASARO, F. STEPHENS, and I. PERLMAN; Phys. Rev. 92, 1495, 1953
[19] R. F. CHRISTY; private communication
[20] See also E. TELLER and J. A. WHEELER; Phys. Rev. 53, 778, 1938
[21] G. T. SEABORG; Phys. Rev. 85, 157, 1952
W. J. WHITEHOUSE and W. GALBRAITH; Nature 169, 494, 1952
J. R. HUIZENGA; Phys. Rev. 94, 158, 1954
M. H. STUDIER and J. R. HUIZENGA; Phys. Rev. 96, 545, 1954
[22] S. A. MOSZKOWSKI and C. H. TOWNES; Phys. Rev. 93, 306, 1954
[23] See M. H. STUDIER and J. R. HUIZENGA; Phys. Rev. 96, 545, 1954 for tabulation
of available data and references to the original literature
[24] E. FEENBERG and G. TRIGG; Rev. Mod. Phys. 22, 399, 1950


ON THE PASSAGE THROUGH MATTER 
OF SWIFT CHARGED PARTICLES 


J. Lindhard 


Many of the basic ideas in the theory of energy dissipation by charged 
particles passing through matter are contained in two classic papers by 
BOHR [9], [10], in the years 1913 and 1915. The present status of the
subject has been surveyed thoroughly in recent review articles. In 
particular, BOHR [13] has presented a comprehensive general analysis
of the numerous types of collisions accompanying the penetration of 
charged particles, with special reference to the classical and quantal 
aspects of the phenomena. General discussions together with a detailed 
analysis of the empirical data have been given principally by BETHE 
(e.g. BETHE and ASHKIN [4]), and by ALLISON and WARSHAW [1].


In the present article we do not aim at giving a similarly exhaustive 
and balanced presentation of the stopping phenomena. We shall 
merely comment on a few of the general points of view that were raised 
during the development of the theory, and enter on some recent 
investigations in this field. If we discuss the latter in more detail, it is 
only because they are less well known. 


Bohr’s classical theory of energy loss 


When the presence of electrons in atoms was known, it was realized
that information about atomic constitution could be obtained from the
energy dissipation by $\alpha$-particles in their passage through a medium.
Indeed, the major source of energy loss must be the collisions with the
atomic electrons, since their small mass readily allows considerable
energy transfers. In the case of a free electron at rest the cross section
for an energy transfer between T and T + dT, in a collision with a
point charge ze of velocity v, is given by THOMSON's formula

$$2\pi p\,dp = d\sigma = B\,\frac{dT}{T^2}, \qquad (1)$$

where $B = 2\pi z^2 e^4/mv^2$, and p is the impact parameter. The formula
holds for $T < T_{max} = 2mv^2$, m being the electron mass, and the maximum
energy transfer corresponding to p = 0.




Applying THOMSON's formula, it is found that the average energy
loss per unit path of the particle becomes

$$\frac{dE}{dR} = N \sum \int^{T_{max}} T\,d\sigma, \qquad (2)$$

where N is the number of atoms per unit volume, and a summation is 
made over the atomic electrons. The question then arises as to the lower 
limit of the integration, corresponding to the region of distant 
collisions. 


In the early treatment by THOMSON [27] it was supposed that the 
lower limit was the energy required for ionization of the atom. 
DARWIN [15] made the assumption that only $\alpha$-particles actually
passing through the atom could transfer energy to the electrons. 
Though not accurate, such treatments did give information as to the 
structure of atoms and the number of electrons they contain. 
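As a worked illustration of how the lower limit enters, the integral over THOMSON's cross section can be written out for a single electron. Here $T_{min}$ is a symbol introduced only for this sketch to denote whichever lower limit is adopted (in THOMSON's treatment, the ionization energy):

```latex
\int_{T_{\min}}^{T_{\max}} T \, d\sigma
  = B \int_{T_{\min}}^{T_{\max}} \frac{dT}{T}
  = B \log \frac{T_{\max}}{T_{\min}},
\qquad T_{\max} = 2mv^{2}.
```

The dependence on the lower limit is only logarithmic, so even a rough choice of $T_{min}$ gives a usable estimate of the energy loss.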


The proper limit to the energy transfer was derived in 1913 by 
BOHR [9], who made the important step of comparing the atom to a set 
of harmonic oscillators. As in the previous treatments BOHR assumed 
that in close encounters the energy transfer was given by (1). For the 
more distant collisions it was merely demanded that the force from the 
particle could be treated as a small perturbation; this is equivalent to 
a description by harmonic oscillators, since the atomic system will 
only be slightly disturbed from its state of equilibrium. In fact, the
condition of a small perturbation is fulfilled for swift particles of low 
charge, such as $\alpha$- and $\beta$-rays. It follows that the limit to the energy
transfer appears when the collision time is long compared to the atomic
period in question, $1/\omega_\nu$. Accordingly, the limit is of an adiabatic
type, and the binding energy is not determining for the energy loss. 
The collision time introduced here is the time during which the external 
force is appreciable; we may take it to be p/v, where p is the impact 
parameter. 


Putting the limit to the impact parameter equal to $v/\omega_\nu$ we find from
(2) the formula of BOHR,

$$\frac{dE}{dR} = 2B\,N \sum_\nu \log\frac{2v}{\omega_\nu b}, \qquad (3)$$

where b is the so-called collision diameter, $b = 2ze^2/mv^2$. It may be
noted that the energy loss (3) is not simply proportional to the square
of the charge of the particle, since the charge enters in the logarithm
too, through b. This just reveals the circumstance that in the close
collisions a perturbation treatment is not used.


In the stopping effect, the dynamic properties of the atom enter,
according to (3), in the form of an averaged frequency $\omega$, defined by
$Z \log \omega = \sum_\nu \log \omega_\nu$, Z being the number of electrons in the atom.

The magnitude of $\omega_\nu$, for circular orbits in a hydrogenlike field, was
later found by FOWLER [17] to be $0.46\,\omega_0$, where $\omega_0$ is the cyclic
frequency of revolution in the orbit.


In the paper by BOHR cited above approximate estimates of the value
of $\omega$ were already given. It was ascertained that the formula (3)
accounted fairly well for the experimental findings for $\alpha$-rays, although
the accuracy was not high. Yet, doubts arose as to the justification of 
classical mechanics in these dynamical problems, and it was recog- 
nized [11] that one had to be prepared for specific quantum effects 
associated with the limitation of the use of orbital pictures in such 
collision phenomena. 

A special difficulty appeared for distant collisions, where the classical 
energy transfer was extremely small compared to e.g. the energy of 
ionization. However, when applying the new wave mechanical des- 
cription to distant collisions GAUNT [18] found no essential deviations 
from the classical treatment. 


The quantum theory of stopping 

A systematic application of the quantal perturbation treatment to
all collisions, the distant as well as the violent ones, was made by
BETHE [2] in 1930. BETHE derived the formula

$$\frac{dE}{dR} = 2B\,N\,Z \log\frac{2mv^2}{\hbar\omega}, \qquad Z \log \omega = \sum_{i,k} f_{ik} \log \omega_{ik}, \qquad (4)$$

where the difference from (3) is that the wave-length of an electron with
velocity v replaces the collision diameter $b = 2ze^2/mv^2$. The ratio of
these two quantities, $\kappa$, is a number characterizing the strength of
interaction in the collision:

$$\kappa = \frac{2ze^2}{\hbar v}, \qquad (5)$$

and it will be small compared to unity for sufficiently low charges or
high velocities. If we take into account the dynamic character of the
stopping problem, so that the atomic properties enter only through $\omega$,
$\kappa$ will be the only dimensionless quantity in the encounter containing
the interaction $ze^2$. It is then clear from dimensional arguments that
the transition from the classical stopping formula to the quantal
perturbation formula must consist in multiplication by $\kappa$ inside the
logarithm, because a perturbation formula is proportional to $z^2e^4$.


The somewhat smaller stopping resulting from BETHE's formula
gave distinctly better agreement with $\alpha$-ranges than did the classical
formula (3). As to the relative merits of the classical and quantal
formulas a theoretical clarification was achieved soon after, particularly
through the discussions by BLOCH, BOHR and WILLIAMS (cf. the later
review articles by WILLIAMS [30], and BOHR [13]).


BLOCH [5] carried through a quantal calculation of the motion of a
wave packet in the Coulomb field of a point charge. By use of the
indeterminacy relations only, it was similarly shown by BOHR that one
can introduce classical orbital pictures when $\kappa > 1$, and thus the interaction
large in magnitude. When $\kappa < 1$, the classical description fails,
and a quantal perturbation calculation is appropriate. Still, in the
latter case one may use the classical concept of impact parameter in
distant collisions, when the particle passes outside the atom. As
regards the energy transfer, however, one may only use the classical
value of the cross section for violent collisions, but the connection
with the impact parameter is lost. This circumstance shows that the
straggling in energy loss, as governed by high energy transfers, will be
essentially given by the classical formula derived by BOHR [9], also for
$\kappa < 1$.
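The criterion can be made concrete by writing $\kappa = 2ze^2/\hbar v$ as $2z\alpha/\beta$, with $\alpha = e^2/\hbar c \approx 1/137$ the fine structure constant and $\beta = v/c$. The sketch below is only an illustration; the function names and the two sample projectiles are assumptions made here, not examples taken from the text.

```python
FINE_STRUCTURE = 1 / 137.036   # e^2 / (hbar c)


def kappa(z, beta):
    """Interaction strength 2 z e^2 / (hbar v), with beta = v/c."""
    return 2 * z * FINE_STRUCTURE / beta


def regime(z, beta):
    """Which description the criterion on kappa points to."""
    if kappa(z, beta) > 1:
        return "classical orbital picture"
    return "quantal perturbation"


if __name__ == "__main__":
    # An alpha particle of about 5 Mev has beta of roughly 0.052.
    print(round(kappa(2, 0.0518), 3), regime(2, 0.0518))
    # A highly charged heavy ion at a similar velocity (illustrative values).
    print(round(kappa(40, 0.05), 1), regime(40, 0.05))
```

Values of $\kappa$ near unity fall between the two limiting descriptions, so the sharp threshold in the sketch is a simplification.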


While thus, in idealized cases, a solution had been found of the 
paradoxes met with, the more quantitative aspects of the stopping 
theory were in need of further development. For $\kappa < 1$, in hydrogen
and helium, the formula of BETHE gave accurate predictions of the 
energy loss and range of heavy particles. However, for heavier sub- 
stances the situation was less satisfactory, and the simple applications 
of BETHE’s formula resulted in values of the average excitation potential 
which were too low. 


Essential progress in this respect was made in 1933 when BLOCH [6]
applied the Thomas-Fermi model to the dynamics of an atom. The
Thomas-Fermi statistical model must of course lead to a common
formula for the heavier atoms. In fact, it is sufficient to note that the
unit of time in the Thomas-Fermi atom is proportional to $Z^{-1}$; accordingly,
$\omega$ and the average excitation potential become proportional to Z,
i.e. $\hbar\omega = I = Z \cdot I_0$. However, it was found difficult to calculate the
proportionality constant $I_0$. Experiments at high energies (protons of
100 MeV or more) are in good accord with BLOCH's simple relation,
and the empirical value of $I_0$ is about 10 eV, or slightly less. For the
individual elements there are, of course, minor deviations from this 
average, due to the differences in the binding of the outermost atomic 
electrons. 
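As a rough numerical illustration of BLOCH’s relation I = Z . I₀, the following sketch tabulates I for a few elements. The value I₀ = 10 eV is the approximate empirical figure quoted above, taken here as exact for simplicity; the function name and the choice of elements are illustrative assumptions, and the minor deviations for individual elements are ignored.

```python
# Bloch's relation I = Z * I0 for the average excitation potential.
# ASSUMPTION: I0 is set to exactly 10 eV; the text says "about 10 eV, or slightly less".
I0_EV = 10.0

def bloch_excitation_potential(Z, I0=I0_EV):
    """Average excitation potential in eV for atomic number Z (Thomas-Fermi scaling)."""
    return Z * I0

# Elements chosen only as examples of "heavier" substances.
for name, Z in [("Al", 13), ("Cu", 29), ("Ag", 47), ("Au", 79)]:
    print(f"{name:2s}  Z = {Z:2d}   I ~ {bloch_excitation_potential(Z):.0f} eV")
```

Since I enters the stopping formula only through a logarithm, an error of some per cent in I₀ changes the computed stopping power much less than proportionally.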


For a particle moving with relativistic velocity, a two-fold correction
appears in the energy loss. First, although in the distant collisions the
average energy transfer is still given by the non-relativistic result, yet
the adiabatic limit itself is shifted. The reason for this is that the field
acting on an electron is Lorentz contracted, so that the collision time is
shorter. This effect, calculated originally by BOHR [10], gives a contribution
B . N . Z . (− log (1 − v²/c²) − v²/c²) in (4). A second relativistic
effect must appear in the violent collisions, since the struck
electron may attain relativistic velocities. Using the Born approximation
one finds that for a heavy penetrating particle the extra contribution
here is equal in magnitude to that for distant collisions (BETHE [3];
MØLLER [25]). There is, furthermore, a relativistic correction to the
straggling in energy loss and in range [23].
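To get a feeling for the size of the relativistic term, the bracket − log (1 − v²/c²) − v²/c² can be evaluated for a few velocities. The sketch below computes only this bracket; the prefactor is left out, and the function name and sample velocities are illustrative assumptions.

```python
import math

def relativistic_bracket(beta):
    """Return -log(1 - beta^2) - beta^2 for beta = v/c, with 0 <= beta < 1."""
    b2 = beta * beta
    # log1p(-b2) equals log(1 - b2) but is more accurate when b2 is small.
    return -math.log1p(-b2) - b2

# Sample velocities, chosen only for illustration.
for beta in (0.1, 0.5, 0.9, 0.99):
    print(f"v/c = {beta:4.2f}   bracket = {relativistic_bracket(beta):.5f}")
```

Expanding the logarithm shows that the bracket begins with the term (v/c)⁴/2, so that it is negligible until the velocity becomes a considerable fraction of c.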

If the penetrating particle is an electron, its energy loss in a single 
collision need not be small compared to its total energy. Moreover, 
the penetrating electron will be strongly deflected in the field of the 
atomic nuclei. One thus gets a wide distribution in energy loss and in 
the penetration. The proper statistical methods for dealing with such 
problems were introduced by BOHR [10]. Accurate distributions of the
energy loss were derived by WILLIAMS [29], LANDAU [22] and other 
authors. If one regards the stopping as a means of obtaining informa- 
tion concerning the properties of atoms, one should, however, turn to 
particles heavier than electrons. 


Collective effects 


In evaluating the stopping power of a substance it is in general not
possible to consider the individual electrons as bound in a static 
potential. The force from the particle produces a displacement of the 
electrons in the substance, and this displacement gives rise to a strong 
induced field. The resulting reduction in the total field acting on an 
electron is in most cases of importance. 

In order to study this effect, let us consider a gas of free electrons, 
where the absence of binding implies that an adiabatic cut-off can only 
be due to polarization. The stopping in a free gas was derived only 
recently, by KRAMERS [20] (cf. also [8]), although earlier attempts had
been made, and the result may be said to be implicitly contained in a
treatment by FERMI [16].

It is well known that in a gas of free electrons oscillations can occur,
with the frequency ω₀ = (4πρe²/m)^(1/2), where ρ is the density of electrons.
This phenomenon is of purely classical type, and atomicity of mass
and charge need not be assumed. The oscillations of an electron in the
gas will thus interfere destructively with the force from the particle
when the collision time p/v is longer than the oscillation time ω₀⁻¹.
The stopping of the penetrating particle is therefore completely as for
a harmonic oscillator, and in (4) one need merely replace ω by ω₀;
at the same time one must of course in (4) interpret N . Z as the density
of electrons, ρ. This result emphasizes the dynamic character of the
average excitation potential.
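The magnitude of the oscillation frequency ω₀ = (4πρe²/m)^(1/2) may be illustrated by a short calculation in Gaussian units. The constants are rounded, and the electron density used is a hypothetical value chosen only to show the order of magnitude; the function names are likewise assumptions of this sketch.

```python
import math

# Gaussian (CGS) constants, rounded; adequate for an order-of-magnitude sketch.
E_CHARGE = 4.80320e-10    # electron charge, esu
E_MASS = 9.10938e-28      # electron mass, g
HBAR = 1.05457e-27        # reduced Planck constant, erg s
ERG_PER_EV = 1.60218e-12  # conversion factor

def plasma_frequency(rho):
    """omega_0 = (4 pi rho e^2 / m)^(1/2) in rad/s, for electron density rho in cm^-3."""
    return math.sqrt(4.0 * math.pi * rho * E_CHARGE**2 / E_MASS)

def plasma_energy_ev(rho):
    """hbar * omega_0 in eV, playing the part of an average excitation potential."""
    return HBAR * plasma_frequency(rho) / ERG_PER_EV

rho = 1.0e23  # HYPOTHETICAL electron density in cm^-3, for illustration only
print(f"omega_0 = {plasma_frequency(rho):.3e} rad/s")
print(f"hbar * omega_0 = {plasma_energy_ev(rho):.1f} eV")
```

Because ω₀ varies as the square root of ρ, a fourfold increase in electron density only doubles the effective excitation energy.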

It is apparent that the polarization must play a role not merely for a
free gas, but also in atoms containing many electrons, where even
in the static case the field is determined by the electron charge distribution.
For heavier atoms one will employ the Thomas-Fermi model,
and in this case just the treatment of BLOCH, mentioned above, involved
both the polarization effects and the static binding in the atomic field.
The combined effect is approximately to give a total average excitation
frequency ω ≈ (ω₀² + ω_s²)^(1/2), which is greater than both the polarization
frequency and the harmonic oscillator frequency, ω_s, of the static
binding. Since the two frequencies are approximately equal, neglect
of the polarization leads to an underestimate of the average excitation
potential.


When the above effects are included one finds that, for penetrating 
particles of very high velocity, it is possible to account fairly well for 
the experimental stopping powers, and at least no serious disagreement 
seems to occur. It appears possible to evaluate e.g. the small variations 
of I about the Bloch value I₀ . Z, due to the effect of the most loosely
bound electrons, and similarly the minor corrections for chemical 
bindings. 


Stopping at lower velocities 


The interest focuses then on somewhat lower particle velocities, where 
the collisions acquire an adiabatic character for the more strongly
bound electrons, and the stopping no longer behaves as in (4). One
may, of course, still retain the form (4), if so desired, and then say that I
varies with the particle velocity, for lower velocities. Now, it is
important to note that the deviations set in smoothly already at high 
particle velocities. This can be seen from the case of hydrogen, or a 
gas of free electrons, where the behaviour of the energy loss can be 
evaluated rather easily. The deviations from BETHE’s formula at 
moderate particle velocities gave rise to ambiguities at first, when one 
tried to make empirical estimates of the atomic average excitation 
potentials. 


BETHE gave a correction to atomic stopping power due to the K- 
electrons only. However, if we ask for the behaviour for heavier atoms, 
and down to quite low particle velocities, where the contributions of most 
of the atomic electrons are reduced, a Thomas-Fermi treatment should 
again be preferable, giving a natural extension of BLOCH’s relation.


If this reduction is of simple adiabatic type, it can be found from a
comparison of the atomic transition frequencies and the frequencies
in the collision, the former being proportional to Z for the Thomas-Fermi
atom, and the latter proportional to 2mv²/ħ. Instead of
log (2mv²/Z . I₀) in the Bethe-Bloch formula one must accordingly find
a function of the type L = L(x), where x is the dimensionless variable
v²/Z . v₀², and v₀ = e²/ħ [23]. Thus, one has the advantage that stopping
in different substances can be directly compared, and should give
a common curve. If, for quite low velocities, the stopping is proportional
to 1/v (GEIGER’s formula), it will also vary as Z^(1/2).
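The last step may be verified in a few lines. If the stopping is written as (Z/v²) . L(x) with x = v²/(Z . v₀²), a dependence on 1/v requires L to be proportional to x^(1/2), and the factor Z^(1/2) then follows. The sketch below assumes exactly this form of L, with an arbitrary constant c; velocities are measured in units of v₀, and all names are illustrative.

```python
import math

def x_variable(v, Z):
    """Dimensionless variable x = v^2 / (Z * v0^2), with v given in units of v0."""
    return v**2 / Z

def stopping_low_velocity(v, Z, c=1.0):
    """Stopping in arbitrary units, (Z / v^2) * L(x).

    ASSUMPTION: L(x) = c * sqrt(x), the form needed for a 1/v dependence.
    """
    return (Z / v**2) * c * math.sqrt(x_variable(v, Z))

# The product stopping * v / sqrt(Z) should be the same for every Z and v.
for Z in (13, 29, 79):
    for v in (0.5, 1.0, 2.0):
        s = stopping_low_velocity(v, Z)
        print(f"Z = {Z:2d}  v/v0 = {v:3.1f}  s * v / sqrt(Z) = {s * v / math.sqrt(Z):.6f}")
```

The constancy of the last column is merely the statement that stopping curves for different substances collapse on a common curve when plotted against x.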


The measurements on stopping at low velocities, e.g. protons of 
1 MeV or less, are much more difficult to perform than those at high 
velocities, because of the higher stopping powers involved. Most of the 
experiments have been done during the last years; a review of such 
work was published by ALLISON and WARSHAW [1]. The data seem
in fair accord with the above simple approximation based on the 
Thomas-Fermi atom. 


The function L(v²/Z . v₀²) is, over a considerable interval, rather lower
than the Bethe-Bloch expression, but rises above it for sufficiently low 
velocities. The behaviour of L has not been tested accurately in the 
intermediate region where it is smallest compared to the Bethe-Bloch 
expression. This variation of L is important, not only for the stopping 
in the region in question, but also for the range-energy data at much 
higher particle energies. 


Description by Maxwell equations in matter 


An approach where one considers the stopping process in terms of 
collisions between the particle and individual electrons does not directly 
show the connection with other electromagnetic properties of the sub- 
stance. Moreover, it is not quite adequate when the polarization 
effects are of significance. Let us then ask, as an alternative, in how far 
one is able to account for the phenomena in terms of Maxwell equations 
in matter. 


In the latter kind of description one attempts to solve the problem 
on classical lines, since the Maxwell equations are concerned with a 
classical field. The stopping power is here to be considered as the 
average energy dissipation when the particle moves a considerable 
distance with nearly constant velocity. It may be obtained from the 
Maxwell equations by deriving either the classical electric field acting 
on the particle, or the current of energy transported away from the 
particle track. The fluctuations come about only when quantum theory 
is taken into account for the Maxwell field, since the energy and 
momentum dissipation is then no longer continuous. 




The first application of the macroscopic Maxwell equations in matter 
was made in 1937 by FRANK and Tamm, cf. TAMM [26]. These authors 
studied that part of the energy loss which goes off in form of coherent 
radiation, as discovered some years earlier in experiments by CERENKOV. 
FERMI [16] gave a detailed account of the manner in which the energy 
loss in distant collisions can be described by Maxwell equations. In 
this way he found a peculiar saturation of the energy loss in distant 
collisions, for particle velocities approaching the velocity of light. This 
phenomenon was studied later by several other authors (cf. a review 
article by UEHLING [28]).

In order to discuss the possibility of using the Maxwell equations in 
matter we may quote the formula for the energy loss, as obtained from 
these equations in a straightforward manner. One gets for a heavy 
particle moving with nearly constant velocity, by finding the electric 
field acting on it, 


dE/dR = [integral over k and ω of an expression containing ε and μ; not legible in the source]     (6)


where ε is the dielectric constant, and μ the magnetic permeability.
The quantity ω is the cyclic frequency of the field wave, and k the
numerical value of the wave vector.

If we put μ = 1 in (6), and ε = 1 + ω₀²/(ω_s² − ω(ω + i/τ)), we get just
the part of the energy loss considered by FERMI [16], due to the effects at
some distance from the particle track. The contribution hereto from
the transverse field is the Cerenkov radiation. If, instead, we write
ε = 1 − ω₀²/ω(ω + i/τ), we obtain the stopping due to distant collisions
in a free gas of electrons. But it is to be noted that in such calculations
one must in (6) make a cut-off for large k, and compute the effects of
close collisions by other methods. This kind of division has been
introduced also in calculations of other polarization phenomena (e.g.,
BOHM and PINES [7]); it is hardly well adapted to a systematic linear
approach. The division corresponds to a sharp discrimination between
collective and single particle effects, which is an unwarranted approximation.
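The two model dielectric constants may be compared numerically. The sketch below evaluates Im(−1/ε) as a function of ω for both forms and locates its maximum on a grid; the use of Im(−1/ε) as the measure of longitudinal absorption is an assumption of this sketch and is not taken from (6). The symbol ω_s for the binding frequency, the damping time τ and all numerical values are likewise illustrative.

```python
def eps_free_gas(w, w0, tau):
    """epsilon = 1 - w0^2 / (w (w + i/tau)), free electron gas."""
    return 1.0 - w0**2 / (w * (w + 1j / tau))

def eps_bound(w, w0, ws, tau):
    """epsilon = 1 + w0^2 / (ws^2 - w (w + i/tau)), harmonically bound electrons."""
    return 1.0 + w0**2 / (ws**2 - w * (w + 1j / tau))

def loss_function(eps):
    """Im(-1/epsilon); ASSUMED here as the weight of longitudinal absorption."""
    return (-1.0 / eps).imag

def peak_position(eps_of_w, w_min=0.5, w_max=2.0, steps=3000):
    """Frequency on a uniform grid at which Im(-1/epsilon) is largest."""
    grid = [w_min + (w_max - w_min) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda w: loss_function(eps_of_w(w)))

# Illustrative values; frequencies are measured in units of w0.
w0, ws, tau = 1.0, 1.0, 50.0
print(peak_position(lambda w: eps_free_gas(w, w0, tau)))   # close to w0
print(peak_position(lambda w: eps_bound(w, w0, ws, tau)))  # close to (w0^2 + ws^2)^(1/2)
```

For weak damping the free gas absorbs near ω₀, while the bound electrons absorb near (ω₀² + ω_s²)^(1/2), the same combination of polarization and static binding that was met with in the discussion of collective effects.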

In point of fact the formula (6) has a much wider scope if we allow
ε and μ to be more general integral operators, not merely in time, but
also in space, i.e. ε = ε(k,ω), μ = μ(k,ω). The only assumption we
then maintain is that they are linear operators. As was to be expected,
it turns out that (6) now represents the total energy loss by the particle,
including the close collisions [24]. Thus, the linear Maxwell equations
in matter embrace all dissipation effects, for a particle having a charge
sufficiently small for perturbation theory to be applied.


The form in which the energy loss is expressed in (6) allows a more 
general discussion of the phenomena. We may note that the imaginary 
and real parts of the dielectric constants must of course be directly 
connected, in the manner outlined by KRAMERS [19]. In particular, one
can derive the real part of ε from its imaginary part, and vice versa,
if ε takes account of momentum and energy balance. Similarly,
when we have given the energy loss of a particle of the form (6), which 
expression is directly connected with the imaginary part of the energy 
of the particle, we can immediately find its self-energy in the medium [24]. 


In connection with the general properties of the field equations it 
may be noted that the integral operator 1/μ will usually be different
from unity. In that case one cannot, e.g. from the observed polariza- 
tion or absorption of the transverse field, make a direct semi-empirical 
estimate of the corresponding quantities for the longitudinal field. 
Caution should thus be observed in attempts to derive stopping powers 
from X-ray measurements. 


It is interesting to observe the similarity between the use of linear 
Maxwell equations, and BOHR’s original treatment of the energy
dissipation [9]. BOHR supposed merely that an atom could be compared
to a set of harmonic oscillators. Just the same assumption is made 
when linear Maxwell equations in matter are introduced, only the atom 
is replaced by a large system of atoms. The linear Maxwell equations 
contain the further simplification that even the close collisions are 
treated by perturbation methods. The particular advantage of the field 
treatment is that one does not divide the slowing down into separate 
collisions, and concepts like an impact parameter become unnecessary. 
The classical field treatment is then possible where the classical 
description by electron trajectories fails. 


Highly charged ions


Through the discovery of uranium fission it became possible to investigate
the penetration of highly charged nuclear fragments, having initial
velocities comparable to those of α-rays. The penetration of such ions
is accompanied by an abundance of new phenomena, the discussion of
which was taken up in 1940 by BOHR [12], and LAMB [21]. The
stopping problem should here be of essentially classical type since,
from (5), the number κ is found to be large compared to unity.

An immediate problem is the magnitude of the ion charge. When
κ > 1 the ion carries a number of electrons, and has a total charge z*
of the order z* = z^(1/3) . v/v₀ [12]. This charge value comes about since
electrons bound to the ion with orbital velocities less than v are easily
removed from it in collisions with atoms. Several interesting and more
detailed properties were disclosed in measurements of the total
charges of fission ions in various media. Thus, the charge of an ion
moving through a gas depends on the gas density, increasing slightly
with the density. For ions emerging from solids the charge is considerably
higher than in gases. These features have been ascribed
to the competition between the life-times of excited ion states and the
capture and loss processes [14]; the closer analysis showed that the
above charge value, z^(1/3) . v/v₀, should hold accurately only for penetration
through heavier gases at low densities.
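The order of magnitude of the ion charge follows at once from the estimate z* = z^(1/3) . v/v₀. In the sketch below the velocity is given in units of v₀ = e²/ħ; the cap at the nuclear charge z is an addition of this sketch (an ion cannot lose more electrons than it has), and the sample ions and velocities are hypothetical values of the general size met with for fission fragments.

```python
def ion_charge(z, v_over_v0):
    """Estimate of the total ion charge, z* = z^(1/3) * v / v0.

    ASSUMPTION: the result is capped at the nuclear charge z, which the
    simple estimate itself does not enforce.
    """
    return min(float(z), z ** (1.0 / 3.0) * v_over_v0)

# HYPOTHETICAL examples: (nuclear charge, velocity in units of v0).
for z, v in [(38, 6.0), (54, 4.0), (54, 1.0)]:
    print(f"z = {z:2d}  v/v0 = {v:3.1f}  z* ~ {ion_charge(z, v):.1f}")
```

The estimate shows directly how the ion gathers electrons as it slows down, the charge falling in proportion to the velocity.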


The classical formula (3) assumes a point charge for the particle. 
Therefore, if one introduces the total ion charge z* in this formula, 
one must make a correction for the closest collisions, where the electron 
passes through the strong field inside the ion core. A further modifica- 
tion is required for such highly charged ions, since the force from the 
ion need not be small even in the distant collisions. Thus, also outside 
the usual adiabatic limit v/ω, the electron may be pulled out of the
atom [21]. From a more general point of view this type of stopping 
therefore leads to a typical non-linear problem. Some uncertainty 
prevails in the description of stopping of highly charged ions; it is 
probably due to the complications referred to. 


The measurements on slowing down of highly charged ions are 
mostly on fission fragments. It is expected, however, that more precise 
studies will soon be made with accelerated ions. At present, the 
general indications are that, in heavier substances, the stopping is 
approximately proportional to the square root of the atomic number 
of the substance. It is remarkable that no significant difference between 
gases and solids is observed, although the charge measurements gave 
rather different results in these two instances. The lack of difference in 
stopping may perhaps be ascribed to the strong polarization in solids, 
so that in solids as many electrons accompany an ion as in gases, only 
several of these electrons are not in bound states. 


It is further found that, in a considerable velocity interval, the 
stopping is proportional to the ion velocity. This behaviour is 
expected from simple considerations [13]. One might also make a 
comparison with protons of the same velocity, where the stopping 
behaves as Z^(1/2)/v, except for the lightest substances. Multiplying by
z*² ≈ z^(2/3) . v²/v₀², one is led to an energy loss proportional to
z^(2/3) . Z^(1/2) . v, in fair accord with the measurements.
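The comparison with protons can be checked numerically. The sketch below multiplies a proton stopping of the form Z^(1/2)/v by the square of the ion charge, z*² = z^(2/3) . v²/v₀², and confirms that the product is proportional to z^(2/3) . Z^(1/2) . v. Units are arbitrary, velocities are in units of v₀, and the function names and sample numbers are assumptions made for illustration.

```python
import math

def proton_stopping_low_v(Z, v):
    """Proton stopping in arbitrary units, Z^(1/2) / v, with v in units of v0."""
    return math.sqrt(Z) / v

def ion_charge_squared(z, v):
    """z*^2 = z^(2/3) * v^2, from z* = z^(1/3) * v / v0 with v in units of v0."""
    return z ** (2.0 / 3.0) * v**2

def ion_stopping(z, Z, v):
    """Stopping of a highly charged ion: z*^2 times the proton value at the same v."""
    return ion_charge_squared(z, v) * proton_stopping_low_v(Z, v)

# HYPOTHETICAL ion (z = 38) in a substance with Z = 47.
z, Z = 38, 47
for v in (1.0, 2.0, 4.0):
    print(f"v/v0 = {v:3.1f}  stopping = {ion_stopping(z, Z, v):8.2f}")
```

Doubling the velocity doubles the stopping, which is the proportionality to the ion velocity referred to above.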

As pointed out by BOHR [12], [13], another novel feature of the stopping
of highly charged ions is the important role played by nuclear
collisions. In fact, for velocities less than ~ v₀ nuclear stopping takes
over; for fission ions this final part of the range remains an appreciable
fraction of the total range. The nuclear stopping can be estimated on
purely classical lines, since the number 2zZe²/ħv is much larger than
unity.

For penetrating particles having κ < 1, so that a perturbation
treatment is proper, the linear Maxwell equations gave a convenient 
unified description of the phenomena. The question arises whether the 
Maxwell equations can be of use in the present case of highly charged 
particles. It is apparent from the above that the strong forces involved 
demand the application of non-linear Maxwell equations in matter. 
A systematic approach on these lines appears not to have been made. 
It seems desirable to develop such descriptions in close connection with 
a dynamical extension of the Thomas-Fermi equation. 


REFERENCES 


[1] ALLISON and WARSHAW; Rev. Mod. Phys. 25, 779, 1953
[2] BETHE; Ann. d. Phys. (5) 5, 325, 1930
[3] BETHE; Z. Phys. 76, 293, 1932
[4] BETHE and ASHKIN; Experimental Nuclear Physics I, Chapt. 2, John Wiley and
Sons, New York, 1953
[5] BLOCH; Ann. d. Phys. (5) 16, 285, 1933
[6] BLOCH; Z. Phys. 81, 363, 1933
[7] BOHM and PINES; Phys. Rev. 85, 338, 1952
[8] BOHR, A.; Dan. Mat.-fys. Medd. 24, No. 19, 1948
[9] BOHR; Phil. Mag. (6) 25, 10, 1913
[10] BOHR; Phil. Mag. (6) 30, 581, 1915
[11] BOHR; Z. Phys. 34, 142, 1925
[12] BOHR; Phys. Rev. 59, 270, 1941
[13] BOHR; Dan. Mat.-fys. Medd. 18, No. 8, 1948
[14] BOHR and LINDHARD; Dan. Mat.-fys. Medd. 28, No. 7, 1954
[15] DARWIN; Phil. Mag. (6) 23, 901, 1912
[16] FERMI; Phys. Rev. 57, 485, 1940
[17] FOWLER; Proc. Camb. Phil. Soc. 22, 793, 1925
[18] GAUNT; Proc. Camb. Phil. Soc. 23, 732, 1927
[19] KRAMERS; Como Conference, 1927
[20] KRAMERS; Physica 13, 401, 1947
[21] LAMB; Phys. Rev. 58, 696, 1940
[22] LANDAU; J. Phys. U.S.S.R. 8, 201, 1944
[23] LINDHARD and SCHARFF; Dan. Mat.-fys. Medd. 27, No. 15, 1953
[24] LINDHARD; Dan. Mat.-fys. Medd. 28, No. 8, 1954
[25] MØLLER; Ann. d. Phys. (5) 14, 531, 1932
[26] TAMM; J. Phys. U.S.S.R. 1, 439, 1939
[27] THOMSON; Phil. Mag. (6) 23, 449, 1912
[28] UEHLING; Ann. Rev. Nucl. Sci. 4, 315, 1954
[29] WILLIAMS; Proc. Roy. Soc. A 125, 420, 1929
[30] WILLIAMS; Rev. Mod. Phys. 17, 217, 1945




